Rohan
Rohan

Reputation: 1657

How to build a "word unscrambler"?

Let's say I was building a word unscrambler with the letters a,e,l,p,p. The number of possibilities for arrangement is huge - how do I only display those which are actual words (apple, pale etc)?

Upvotes: 5

Views: 8443

Answers (6)

hookedonit
hookedonit

Reputation: 56

or, you can use the developer.dictionary.com api and just do a word lookup for validation. can also perform spell checks.

Upvotes: 0

Mark Elliot
Mark Elliot

Reputation: 77044

Classically word lookup problems can be efficiently solved using a Trie.

I would suggest finding a word list, say, from WordNet, store it in a Trie, and then perform fast lookups of possible words.

A solution would be of the form:

  1. load the word list
  2. store the word list in a trie
  3. accept input for a word to unscramble
  4. try permutations i=1..N

    a. lookup permutation i using the trie

    b. if there's a positive result, store this for display

    c. iterate (i++)

  5. repeat from 3.

edit:

A side note here is that for any N length character word there could be N! required lookups (for 7 characters that would be 5040). You should consider making some optimizations to the trie lookup algorithm. For instance, you gain substantial efficiency by ruling out invalid substrings early, and not repeating end permutations.

e.g. given the word apple, if you had the permutation where you selected "ppl" as the first three characters, no word will be found. So, no matter how you permute the a and the e at the end you cannot construct a word. Early termination of permutations may be important to your algorithm's efficiency.

Upvotes: 3

meouw
meouw

Reputation: 42140

I actually like zerkms's solution better but here's another one

create 2 tables

words
-----
word_id (primary key)
word


letter_index
-----
letter (idx)
word_id (idx)

When you add a word to the words table you have to add an entry to the letter_index for each unique letter. letter_index has a primary key based on both the letter and the word_id.
To find words comprising of a group of letters you create a query something like:

SELECT word FROM words w
// for each letter in the search
INNER JOIN letter_index i ON ( w.word_id = i.word_id AND i.letter = letter_1 )
INNER JOIN letter_index i ON ( w.word_id = i.word_id AND i.letter = letter_2 )
INNER JOIN letter_index i ON ( w.word_id = i.word_id AND i.letter = letter_3 )
...
INNER JOIN letter_index i ON ( w.word_id = i.word_id AND i.letter = letter_n )

Upvotes: 0

user187291
user187291

Reputation: 53940

you can also consider pspell

http://php.net/manual/en/book.pspell.php

$ps = pspell_new("en");
foreach(array('alppe', 'plape', 'apple') as $word)
   if(pspell_check($ps, $word))
      echo $word;

Upvotes: 2

zerkms
zerkms

Reputation: 254924

Ah, and the another answer:

If you just want to get all real words - then find any big dictionary. then store it in manner of:

word | hash

where word is word itself and hash is sorted alphabetically letters:

for apple hash will be: aelpp or aelp2

then for given letters traverse all combinations using the same algo for hashing and search through this table.

Upvotes: 3

Stephen Cross
Stephen Cross

Reputation: 1023

Store a list of words in a file or a database, and then just try all the combinations. You could also consider the likely position of vowels vs consonants to potentially speed it up. Rather than making your own word list, you could use something like WordNet.

Upvotes: 0

Related Questions