prism
prism

Reputation: 83

Find the maximum number of words (from a given dictionary) that collectively represent a given text

Input: Text T and a set of n words over a finite alphabet.

We have to find the longest representation of words that when combined make up T. This can be done by merging words together.

For instance, given the output:

words ={ "na", "ba", "banana", "bana", "a", "nan"}

T= "banana"

The output should be "ba""nan""a" because the number of words is 3. "bana""na" and "banana" consist of 2 and 1 word/s respectively, therefore this is not the maximum number of words, and the output should be "ba""nan""a".

I have tried to solve this with a greedy algorithm, but it does not work. So, I think that the solution is dynamic programming, but I don't know how. I tried another approach, to do this with a Trie - by searching for the current letter in the vertices of the trie.

Is this an optimal, or even proper solution? If not, what is the dynamic programming solution?

Thank you in advance!

Upvotes: 2

Views: 779

Answers (1)

xashru
xashru

Reputation: 3590

You can solve this using dynamic programming. Here is a solution in python using top-down approach with memoization

memo = {} # memoization
def find_max(words, t, pos):
    if pos in memo:
        return memo[pos]
    ret = 0
    for i in range(pos+1, 1+len(t)):
        sub = t[pos:i]
        if sub in words:
            ret = max(ret, 1 + find_max(words, t, i))
    memo[pos] = ret
    return ret

words = ["na", "ba", "banana", "bana", "a", "nan"]
t = "banana"
ret = find_max(words, t, 0)   # returns 3

You can optimize the for loop for finding the matched words starting from pos in a more efficient way. Trie can be used for this.

Upvotes: 1

Related Questions