Reputation: 113
I have been tasked with identifying an efficient algorithm [O(n*log(n))] that, given a set of k Strings S = {s-1, s-2, s-3, ..., s-k}, will identify the longest substring T for each pair of strings (s-i, s-j), such that T is a suffix of s-i and a prefix of s-j, as well as the longest substring T for each pair of strings (s-j, s-i). n represents the added lengths of all k strings (n = |s-1| + |s-2| + |s-3| + ... + |s-k|).
Any thoughts? A link to a solution would be fine as well. Thanks in advance!
Upvotes: 0
Views: 1476
Reputation: 22847
Algorithm 4.10 on page 61 of the book Algorithmic Aspects of Bioinformatics gives a method of computing the longest common substring of a set of given strings using suffix trees
The article also explains how finding the longest common substring is possible
in linear time with respect to the size of the suffix tree, i.e. in O(n log n).
Upvotes: 1