Source1
Source2
Source2
Suffix Tree Applications
Suffix Trees can be used to solve a large number of string problems that occur in text-editing, free-text search, computational biology, and other application areas. Some examples are given below.
String Search
Searching for a substring,
pat[1..m]
, in txt[1..n]
, can be solved in O(m) time (after the suffix tree for txt
has been built in O(n) time).Longest Repeated Substring
Add a special ``end of string'' character, e.g. `$', to i.e., `issi' in the case of `mississippi'. The longest repeated substring can be found in O(n) time using a suffix tree.
txt[1..n]
and build a suffix tree; the longest repeated substring of txt[1..n]
is indicated by the deepest fork node in the suffix tree, where depth is measured by the number of characters traversed from the root,Longest Common Substring
The longest common substring of two strings,
txt1
and txt2
, can be found by building a generalized suffix tree for txt1
and txt2
: Each node is marked to indicate if it represents a suffix of txt1
or txt2
or both. The deepest node marked for both txt1
and txt2
represents the longest common substring.Equivalently, one can build a (basic) suffix tree for the string
(Try it using the HTML FORM above.)
txt1$txt2#
, where `$' is a special terminator for txt1
and `#' is a special terminator for txt2
. The longest common substring is indicated by the deepest fork node that has both `...$...' and `...#...' (no $) beneath it.(Try it using the HTML FORM above.)
Note that the `longest common substring problem' is different to the `longest common subsequence problem' which is closely related to the `edit-distance problem': An instance of a subsequence can have gaps where it appears in
txt1
and in txt2
, but an instance of a substringcannot have gaps.Palindromes
A palindrome is a string, P, such that P=reverse(P). e.g. `abba'=reverse(`abba'). e.g. `ississi' is the longest palindrome in `mississippi'.
The longest palindrome of
(Try it.)
txt[1..n]
can be found in O(n) time, e.g. by building the suffix tree for txt$reverse(txt)#
or by building the generalized suffix tree for txt
and reverse(txt)
.(Try it.)
No comments:
Post a Comment