Reputation: 1460
I'm having trouble coming up with a non-brute force approach to solve this problem I've been wondering: what set of N letters can be used to make the most words from a given dictionary? Letters can be used any number of times.
For example, for N=3, we can have EST to give words like TEST and SEE, etc...
Searching online, I found some answers (such as listed above for EST), but no description of the approach.
My question is: what well-known problems are similar to this, or what principles should I use to tackle this problem?
NOTE: I know it's not necessarily true that if EST is the best for N=3, then ESTx is the best for N=4. That is to say, you can't just append a letter to the previous solution.
In case you're wondering, this question came to mind because I was wondering what set of 4 ingredients could make the most cocktails, and I started searching for that. Then I realized my question was specific, and so I figured this letter question is the same type of problem, and started searching for it as well.
Upvotes: 4
Views: 323
Reputation: 1460
So I coded up a solution to this problem that uses a hash table to get things done. I had to deal with a few problems along the way too!
N
be the size of the group of letters you are looking for that can make the most words. Let L
be the length of the dictionary.'test' -> {'e','s','t'}
You go through the dictionary 1+2N times for steps 1-5, step 6 goes through a version of the dictionary and check (2^N)-1 subsets each time (ignore null set). That gives O(2NL + L*2^N) which should approach O(L*2^N). Not bad, since N will not be too big in most applications!
Upvotes: 0
Reputation: 6737
For each word in dictionary, sort it letters and remove duplicates. Let it be the skeleton of the word. For each skeleton, count how many words contain it. Let it be its frequency. Ignore all skeletons whose size is higher than N.
Let a subskeleton be any possible removals of 1 or more letters from the skeleton, i.e. EST has subskeletons of E,S,T,ES,ET,ST. For each skeleton of size N, add the count of this skeleton and all its subskeletons. Select the skeleton with maximal sum.
You need O(2**N*D) operations, where D is size of the dictionary.
Correction: we need to take into account all skeletons of size up to N (not only of words), and the numbet of operations will be O(2**N*C(L,N)), where L is the number of letters (maybe 26 in english).
Upvotes: 2