Reputation: 227
How to find largest COMMON substring among two Strings?
Upvotes: 4
Views: 2813
Reputation: 1
Basically two ways: 1.dynamic programming, which cost O(mn) time and O(mn) space. "templatetypedef" has answered this. 2. suffix tree, by this you need to build it first, the building process with suffix link is O(m+n) (time and space), if without suffix link is O((m+n)^2) (time). Though building the suffix tree process is of same efficiency with dynamic programming in best case, however, after you have built it, you could get the longest common substring in O(k) time (k is the length of the LCS).
Upvotes: 0
Reputation: 372704
I think you can solve this using a pretty elegant dynamic programming algorithm that runs in O(mn) time and O(mn) space, where m and n are the number of characters in each string. The idea is based on the following recurrence. Let the two strings be A = a0 a1 a2 ... an-1 and B = b0 b1 b2 ... bm-1 and look at their first characters a0 and b0. Then there are three ways you can try to find the longest common subsequence:
This gives us a very nice recurrence:
LCS(A[0 .. n], B[0 .. m]) = longest of {
A[0] + LCS(A[1 .. n], B[1 .. m]) (if A[0] == B[0]),
LCS(A[1 .. n], B[0 .. m]),
LCS(A[0 .. n], B[1 .. m])
}
As our base cases, the longest common substring of any string and the empty string is the empty string:
LCS (A[n .. n], B[i, m]) = ""
LCS (A[i .. n], B[m, m]) = ""
This definition of the longest common substring allows you to compute the value of LCS(A[i .. n], B[j .. m]) given the three values LCS(A[i + 1 .. n], B[j + 1 .. m]), LCS(A[i .. n], B[j + 1 .. m]), and LCS(A[i + 1 .. n], B[j .. m]). Consequently, if you compute these values in the right order, you can just populate a table with the results in one pass and construct the result from there. Using some standard DP tricks, this can be made to run in O(mn).
Upvotes: 4
Reputation: 29401
Probably using a Suffix Tree. Create trees for both strings, then use those structures to look for paths that are common.
Upvotes: 5