Reputation: 1657
Let's say I was building a word unscrambler with the letters a,e,l,p,p
. The number of possibilities for arrangement is huge - how do I only display those which are actual words (apple, pale etc)?
Upvotes: 5
Views: 8443
Reputation: 56
or, you can use the developer.dictionary.com api and just do a word lookup for validation. can also perform spell checks.
Upvotes: 0
Reputation: 77044
Classically word lookup problems can be efficiently solved using a Trie.
I would suggest finding a word list, say, from WordNet, store it in a Trie, and then perform fast lookups of possible words.
A solution would be of the form:
try permutations i=1..N
a. lookup permutation i using the trie
b. if there's a positive result, store this for display
c. iterate (i++)
repeat from 3.
edit:
A side note here is that for any N length character word there could be N! required lookups (for 7 characters that would be 5040). You should consider making some optimizations to the trie lookup algorithm. For instance, you gain substantial efficiency by ruling out invalid substrings early, and not repeating end permutations.
e.g. given the word apple, if you had the permutation where you selected "ppl" as the first three characters, no word will be found. So, no matter how you permute the a and the e at the end you cannot construct a word. Early termination of permutations may be important to your algorithm's efficiency.
Upvotes: 3
Reputation: 42140
I actually like zerkms's solution better but here's another one
create 2 tables
words
-----
word_id (primary key)
word
letter_index
-----
letter (idx)
word_id (idx)
When you add a word to the words table you have to add an entry to the letter_index for each unique letter. letter_index has a primary key based on both the letter and the word_id.
To find words comprising of a group of letters you create a query something like:
SELECT word FROM words w
// for each letter in the search
INNER JOIN letter_index i ON ( w.word_id = i.word_id AND i.letter = letter_1 )
INNER JOIN letter_index i ON ( w.word_id = i.word_id AND i.letter = letter_2 )
INNER JOIN letter_index i ON ( w.word_id = i.word_id AND i.letter = letter_3 )
...
INNER JOIN letter_index i ON ( w.word_id = i.word_id AND i.letter = letter_n )
Upvotes: 0
Reputation: 53940
you can also consider pspell
http://php.net/manual/en/book.pspell.php
$ps = pspell_new("en");
foreach(array('alppe', 'plape', 'apple') as $word)
if(pspell_check($ps, $word))
echo $word;
Upvotes: 2
Reputation: 254924
Ah, and the another answer:
If you just want to get all real words - then find any big dictionary. then store it in manner of:
word | hash
where word is word itself and hash is sorted alphabetically letters:
for apple hash will be: aelpp or aelp2
then for given letters traverse all combinations using the same algo for hashing and search through this table.
Upvotes: 3
Reputation: 1023
Store a list of words in a file or a database, and then just try all the combinations. You could also consider the likely position of vowels vs consonants to potentially speed it up. Rather than making your own word list, you could use something like WordNet.
Upvotes: 0