Reputation: 31

shortest uncommon subsequence

Given two strings s and t, determine length of shortest string z such that z is a subsequence of s and not a subsequence of t.

example :

s :babab, t :babba

sol : 3 (aab)

not looking for copy pastable code, please if anybody can help with intution for solving this.

thanks a lot !

Upvotes: 3

Answers (1)

kcsquared

Reputation: 5409

You can use dynamic programming to solve this in quadratic time just like the longest common subsequence. I'll show the formula and how you would come up with it.

First, some definitions. Let m be the length of S, and let n be the length of T. Define DP(i, j) as the length of the shortest subsequence of S[:i] that is not a subsequence of T[:j], or 'INF' if none exists. Here, the expression S[:i] is slice notation meaning 'the first i characters of S, so S[:0] is empty and S[:m] is all of S. We want DP(m, n).

There's two easy base cases: Since T[:0] is empty, any character in S will work, so DP(i, 0) = 1 if i > 0. Similarly, DP(0, j) = 'INF' for all j.

Now, we just have to write a general formula for DP(i, j) which only depends on the value of DP() on indices smaller than (i, j). The last character of S[:i] is just some character S[i-1]. Either our optimal subsequence for S[:i], T[:j] ends with S[i-1] or it doesn't.

If our optimal subsequence doesn't end with S[i-1], then we can delete S[i-1] from consideration, and our answer is DP(i, j) = DP(i-1, j).

If our optimal subsequence does end with S[i-1], then we need to know the rightmost occurrence of S[i-1] in T[:j].

If S[i-1] does not occur in T[:j] at all, then S[i-1] by itself is a shortest subsequence, so DP(i, j) = 1.
Otherwise, let Rightmost(c, j) be the rightmost index of T[:j] equal to some character c. Since we are using S[i-1] to end our optimal subsequence, we can ignore all the characters in T[:j] after the rightmost occurrence of S[i-1]: they can no longer affect whether a string is a subsequence of T[:j]. So then DP(i, j) = DP(i-1, Rightmost(S[i-1], j)) + 1, where the +1 comes from the fact that we did choose to use S[i-1].

Putting those together, the general formula for i, j > 0 becomes:

DP(i, j) = 1 if (S[i-1] not in T[:j]), or 

         = min(DP(i-1, j),
               DP(i-1, Rightmost(S[i-1], j)) + 1) otherwise.

Since Rightmost(c, j) is always less than j by definition, we've achieved a formula using only indices smaller (lexicographically) than (i, j), and we can use that formula directly for a recursive algorithm.

Upvotes: 1

shortest uncommon subsequence

Answers (1)

Related Questions