barmanthewise
barmanthewise

Reputation: 377

Word decomposition algorithm

Let A be a finite alphabet and A* the set of all the finite strigns you can form from that alphabet. You are then given a set F of some strings from A* (tipically a lot smaller than A*) that contains the alphabet itself (A), and a string S beloging to A*.

Is there a polynomial time algorithm to find the minimum decomposition of S in terms of strings that belong to F?

By decomposition I mean a writing of S = F1...Fk of concatenated strings Fj that belong to F, and you are trying to minimize the number k.

Upvotes: 3

Views: 263

Answers (1)

Peter de Rivaz
Peter de Rivaz

Reputation: 33499

We can solve this in polynomial time using dynamic programming as follows.

Let D[i] be the minimum decomposition of the first i letters i of your string S.

If n is the length of S, and f is the total length of all strings in F, we can compute D in time O(n.f) by iterating over i.

For each value of i we try all choices in F for the end of the string and pick the one that results in the minimum decomposition.

Once we have done this, the answer to the original question is contained in D[n].

Python code

F=['abcd','de','a','b','c','d']
S='abcde'

D = [ [] ]
for i in range(1,len(S)+1):
    s = S[:i]
    best = None
    for f in F:
        if s.endswith(f):
            a = D[i-len(f)]
            if a is None:
                continue
            if best is None or len(a) < len(best):
                best = a + [f]
    D.append(best)

print D[len(S)]

prints:

['a', 'b', 'c', 'de']

Upvotes: 3

Related Questions