Reputation: 375
I am looking for an efficient algorithm to do the following.
Given a table/dictionary of letter counts
counts = {'a': 3, 'b': 2, 'd': 2, 'e': 2, 'k':2, 'r': 2 }
and a list of M words
words = ['break, 'add', 'build' ... ]
write a function that finds N words using all the letters (and words from the list)
>>> find_words(counts, words, 3)
['break', 'break', 'add']
The naive approach of going over all the words N times is too slow (I think it's O(m^n)).
Upvotes: 2
Views: 98
Reputation: 46497
This problem has a strongly NP-complete feeling to it. I therefore would expect there to be no truly efficient solution.
I would therefore suggest that you try reducing it to a known problem and solve that instead. In particular you can convert this to an integer linear programming problem. Solvers for those are often surprisingly good. And per Python Mixed Integer Linear Programming there are a number that are available to you relatively easily.
The specific problem that you are trying to solve is this. You are trying to find a vector of m
integers x_i
such that 0 ≤ x_i
, sum(x_i) ≤ n
, and for each letter l
, the sum of how many times the letter has been used does not exceed your count. The objective function that you would like to maximize is the sum of x_i * (1 + len(w_i))
where w_i
is the i
'th word.
This will always produce an answer. The answer that you get will represent a solution to your problem if and only if the objective function is n
plus the count of letters available. If the objective function is less than that, then no solution exists.
Upvotes: 1