Wifi

Reputation: 11

Find the number of unique longest common subsequences

Given two strings, I want to find the number of distinct LCSs. I read on Wikipedia how to print all LCSs, but how do I check that they are distinct? A hash table is not feasible: each of my input strings can be 1500-2000 characters long, so the maximum number of LCSs can be as large as 2000 choose 1000.
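To put that bound in perspective, here is a quick sketch using `math.comb` from the Python standard library (3.8+). It only evaluates the binomial coefficient quoted above; it says nothing about how many LCSs a particular pair of strings actually has.

```python
import math

# Size of the worst-case bound quoted in the question: 2000 choose 1000.
bound = math.comb(2000, 1000)

# Count the decimal digits instead of printing the whole number.
digits = len(str(bound))
print(digits)  # 601
```

A number with that many digits rules out storing every candidate explicitly, whatever the container.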

Upvotes: 1

Views: 2492

Answers (3)

doodhwala

Reputation: 358

Once you find each subsequence, insert it into a lazy version of a trie.

A plain trie wastes memory because it creates a node for every character of every string. So instead of spelling each string out to its end, park the whole string at a node and branch out only when that is necessary to resolve a conflict.

E.g. anna, apps, anne

Initially, the root node will hold anna.

When you try to insert apps, you find that there is already a string at the root, so you create an a branch and try to put both anna and apps under it. The conflict remains until the two strings split into the n and p branches.

Currently, the trie will look like:

                                    a
                           (anna) n   p (apps)

Now when you insert anne, you follow a and then n, find anna parked there, and resolve the conflict by adding another n branch followed by a and e branches.

Final trie will look like:

                                    a
                                  n   p (apps)
                                n
                       (anna) a  e (anne)
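The walkthrough above could be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the names `Node`, `LazyTrie`, `insert` and `count` are hypothetical, and each parked string is kept whole at its node for simplicity.

```python
class Node:
    __slots__ = ("word", "children")

    def __init__(self, word=None):
        self.word = word      # whole string parked at this node, if any
        self.children = {}    # char -> Node, created only on conflict


class LazyTrie:
    def __init__(self):
        self.root = Node()
        self.count = 0        # number of distinct strings inserted so far

    def insert(self, s):
        """Insert s; return True if it was new, False if already present."""
        node, depth = self.root, 0
        while True:
            if node.word == s:
                return False                      # exact duplicate
            if node.word is None and not node.children:
                node.word = s                     # empty node: park s here
                self.count += 1
                return True
            if node.word is not None and len(node.word) > depth:
                # Conflict: push the parked string one level down.
                parked, node.word = node.word, None
                node.children[parked[depth]] = Node(parked)
            if depth == len(s):
                node.word = s                     # s ends exactly at this node
                self.count += 1
                return True
            child = node.children.get(s[depth])
            if child is None:
                node.children[s[depth]] = Node(s)  # free branch: park s there
                self.count += 1
                return True
            node, depth = child, depth + 1


trie = LazyTrie()
results = [trie.insert(w) for w in ["anna", "apps", "anne", "anna"]]
print(results, trie.count)
```

Inserting anna, apps and anne in that order produces the same branching as the diagrams above, and the second anna is rejected as a duplicate, so `count` ends up as the number of distinct strings.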

Upvotes: 2

David Lehavi

Reputation: 1198

Throw the two strings into a suffix tree. This takes time and space linear in the length of the concatenation of the two strings.

Upvotes: 0

Johannes Hoff

Reputation: 3901

You can use a hash table, but instead of storing the whole substring, you just store (a list of) the beginning and end of it relative to the original string. This way, you can do a string compare in the original string if there are any collisions.
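A rough sketch of that idea, assuming every candidate is a contiguous slice of the original string that a single (start, end) pair can describe, as the answer's wording implies. The helper names `span_hash` and `count_distinct_spans` and the hash parameters are illustrative choices only.

```python
def span_hash(text, start, end, base=257, mod=(1 << 61) - 1):
    """Polynomial hash of text[start:end], computed without copying the slice."""
    h = 0
    for i in range(start, end):
        h = (h * base + ord(text[i])) % mod
    return h


def count_distinct_spans(text, spans):
    """Count distinct slices of text, storing only (start, end) pairs."""
    buckets = {}  # (length, hash) -> list of representative (start, end) pairs
    distinct = 0
    for start, end in spans:
        key = (end - start, span_hash(text, start, end))
        seen = buckets.setdefault(key, [])
        # On a hash collision, compare against the original string itself.
        if any(text[s:e] == text[start:end] for s, e in seen):
            continue
        seen.append((start, end))
        distinct += 1
    return distinct


text = "abcabc"
spans = [(0, 3), (3, 6), (1, 3), (4, 6), (0, 2)]  # abc, abc, bc, bc, ab
print(count_distinct_spans(text, spans))
```

The table holds only index pairs, so its memory use grows with the number of distinct candidates rather than with their total length.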

Upvotes: 1
