Reputation: 68738
Consider a dynamic programming problem that asks how many distinct subsequences (not necessarily contiguous) of a sequence S have a certain property P of value p0.
The range of P is small and finite, and there is an efficient way of calculating P:
P(s1 + s2) = f(P(s1), P(s2))
where +
denotes sequence concatenation.
One way to do this would be to count how many subsequences there are of the prefix S[1] + S[2] + ... + S[k]
of S that have property px. (Store this in Count[px][k]
)
So the recursion is:
Count[px][k] = Count[px][k-1] // not using element S[k];
P pq = f(px,P(S[k])); // calculate property pq of appending element S[k]
Count[pq][k] += Count[px][k-1] // add count of P(prefix+S[k])
and the answer is then:
return Count[p0][S.length]
This works when the elements of S are pairwise distinct, however it will count twice subsequences that have equal value but use different elements of different positions.
How can I modify this algorithm so that it counts equal subsequences exactly once ? (ie only counts distinct subsequences)
Upvotes: 6
Views: 1789
Reputation: 33519
Suppose the sequence is of characters and S[k] is the letter x.
The problem is that you have double counted all sequences that don't use S[k], but nevertheless end with x (this can only happen if x happened earlier in the sequence).
Suppose the last time x appeared was at element S[j]. All the distinct sequences that end with x is simply given by counting all distinct sequences up to position j-1, and then adding x onto all of these.
We can therefore correct for the double counting by subtracting this count.
Count[px][k] = Count[px][k-1] // not using element S[k]
P pq = f(px,P(S[k])) // property pq of appending element S[k]
j = index of previous location in string where S[j]==S[k]
Count[pq][k] += Count[px][k-1] // using element S[k]
Count[pq][k] -= Count[px][j-1] // Removing double counts
Upvotes: 3