Mike
Mike

Reputation: 85

Prefix vs Suffix Trie in String Matching

I'm not too well-versed about the actual algorithms used in string matching with tries.

I'm wondering why there seems to be more focus on suffix tries for string matching rather than prefix tries. Can we not use prefix tries for substring matching also? Put in another way, what are the advantages of suffix tries over prefix tries?

Upvotes: 5

Views: 1449

Answers (2)

templatetypedef
templatetypedef

Reputation: 372814

I think some of the confusion here arises because the term “suffix trie” means more than just “a trie that holds suffixes.” Rather, a suffix trie typically means “a prefix trie holding all the suffixes of a given string.” This contrasts with “prefix trie,” which typically stores an arbitrary collection of strings rather than all prefixes of a given string.

The reason suffix trees are useful is the following fact, sometimes called the fundamental theorem of stringology:

A string x is a substring of a string w if and only if x is a prefix of a suffix of w.

For example, “irate” is a substring of “pirates” because it’s a prefix of the suffix “irates.”

This fact is why suffix tries are so good at substring searching. Suppose you want to see whether x is a substring of w. And further suppose that, somehow, you obtained a suffix tree for w. Then you can just walk the suffix tree from the root downward and see whether you can read x without falling off the tree. If so, x is a prefix of some suffix of w, so x is a substring of w. If not, x isn’t a prefix of any suffix of w, and so x isn’t a substring of w.

As @Ed Staub’s answer shows, you could just as easily do this by using a trie that stores all the prefixes of w in reverse, then checking if x is a suffix of any prefix of w. But in practice it’s easier to think about holding all the suffixes in a prefix trie and so that’s what we do.

Upvotes: 2

Ed Staub
Ed Staub

Reputation: 15690

.retteb era seirt xiferp ,drawkcab daer uoy fI

Seriously. Suffix tries allow you to traverse from the beginning of a string.

Upvotes: 10

Related Questions