bisuke
bisuke

Reputation: 331

How to check for words in a string even if the order is different - PYTHON

I am trying to find make as many words with e.g. 'workbook' So the results should be like: work, workbook, book, bookwork, bow, row etc.

This is one method I tried, but this won't find words that are spelled in different order. (e.g. it won't append 'bow' even though you could rearrange letters within 'workbook' to write 'bow')

f = open('/usr/share/dict/words', 'r')

test = "workbook"
anagramlist = [] 

for word in f:
    if word[:-1] in test and len(word[:-1]) > 2: 
        anagramlist.append(word[:-1]) 
        # this wont append 'bookwork', 'row' etc 

print anagramlist #outputs ['boo', 'book', 'work', 'workbook'] 

Another method I tried approaching this problem is by using sets.. But this doesn't work entirely either because it appends words that for e.g have more than 1 'w's like 'wow' or 'wowwow', even though I want it to only use the number of letters and letters in 'workbook'

f = open('/usr/share/dict/words', 'r')
test = "workbook"
anagramlist = []

for word in f:
    if len(word) > 2 and set(word[:-1]) == set(test) & set(word[:-1]): 
        anagramlist.append(word[:-1])

print anagramlist

the output for this one is. I'm hoping I can fix something in the condition, or maybe this is a completely wrong approach.

['bo', 'bob', 'bobo', 'boo', 'boob', 'boobook', 'book', 'bookwork', 'boor', 'bor', 'boro', 'borrow', 'bow', 'bowk', 'bowwow', 'brob', 'broo', 'brook', 'brow', 'ko', 'kob', 'koko', 'kor', 'or', 'orb', 'ow', 'owk', 'rob', 'rook', 'row', 'wo', 'wob', 'woo', 'work', 'workbook', 'wow', 'wro']

I would really appreciate your help.

Upvotes: 1

Views: 243

Answers (2)

snakile
snakile

Reputation: 54521

First generate all the potential anagrams by calculating all the word permutations and iterating over all the possible anagram lengths. Then filter potential_anagrams according to your words file f.

import itertools

def compute_anagrams(word)
    n = len(word) + 1
    permutations = {''.join(p) for p in itertools.permutations(word)}
    potential_anagrams = {p[:i] for i in range(n) for p in permutations}
    return [anagram for anagram in potential_anagrams if anagram in f]

Deomonstration:

>>> f = ['book', 'bookwork', 'bow', 'row', 'work', 'workbook']
>>> word = 'workbook'
>>> compute_anagrams(words)
['work', 'bow', 'workbook', 'row', 'bookwork', 'book']

Upvotes: 2

Steve Jessop
Steve Jessop

Reputation: 279315

You additionally need to test that for each letter in the dictionary word, it does not appear more times in the dictionary word than it does in "workbook". You could do this for example using the method count() of str.

Of course there are other approaches that in the end might be more efficient, but it's not necessary to start from scratch in order to fix what you have.

Upvotes: 1

Related Questions