user2836553
user2836553

Reputation: 79

Shortest Common Superstring algorithm?

I'm trying to code this problem here: enter image description here

but I'd like to find an algorithm that breaks down the steps for solving the problem. I can't seem to find anything too useful online so I've come here to ask if anyone knows of a resource which I can use to refer to an algorithm that solves this problem.

Upvotes: 3

Views: 5632

Answers (2)

Niema Moshiri
Niema Moshiri

Reputation: 937

I agree with Terry Li: it is only NP-complete to find the SCS of multiple sequences. For 2 sequences (say s is of length n and t is of length m), my solution (doesn't use LCS but uses something similar) is done in O(nm) time:

1) Run a global alignment, in which you disallow mismatches, don't penalize indels, and give a positive score to matches (I did +1 for matches, -10 for mismatches, and 0 for indels, but these can be adjusted). (This is O(nm))

2) Iterate over the global alignment for both output strings v and w. If v[i] isn't a gap, output it. Otherwise, output w[i]. (This is O(n+m)).

Upvotes: -1

Terry Li
Terry Li

Reputation: 17268

This is called the shortest common supersequence problem. The idea is that in order for the supersequence to be the shortest, we want to find as many shared bits of a and b as possible. We can solve the problem in two steps:

  1. Find the longest common subsequence of a and b.

  2. Insert the remaining bits of a and b while preserving the order of these bits.

We can solve the longest common subsequence problem using dynamic programming.

Upvotes: 6

Related Questions