Reputation: 105
I am trying to write a Python script that searches a text file (an English dictionary) for anagrams. I have these three functions:
def is_anagram(a,b):
    a_ = list(a)
    a_.sort()
    b_ = list(b)
    b_.sort()
    if a_ == b_ and a != b:
        return True
    else:
        return False
def find_anagrams(word,t):
    _res=[word]
    for line in t:
        check = line.strip()
        if is_anagram(check,word):
            _res += [check]
    return _res
def find_all_anagrams(f):
    res = {}
    void = []
    for line in f:
        word = line.strip()
        _list = list(word)
        _list.sort()
        key = tuple(''.join(_list))
        if key not in res and key not in void:
            if find_anagrams(word,f) == []:
                void += [key]
            res[key] = find_anagrams(word,f)
    return res
If I call the find_all_anagrams function with:
fin = open ('words.txt')
print find_all_anagrams(fin)
The program stops after the first loop and just gives me
{('a', 'a'): ['aa']}
Why does it not continue and process the second line of words.txt? By the way, the words.txt file is the one from the Moby Project, which can be downloaded here (http://thinkpython.com/code/words.txt).
Upvotes: 0
Views: 1063
Reputation: 17263
When you call find_all_anagrams, it reads the first line from the file. Then it calls find_anagrams, which reads the rest of the file. When the for loop in find_all_anagrams tries to pull the next line from the file, there is nothing more to read, so it returns with the result generated so far.
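The exhaustion described above can be reproduced in a few lines. This is only a minimal sketch, assuming Python 3 and using io.StringIO with three made-up words as a stand-in for the open words.txt file; the variable names are illustrative and not from the question.

```python
import io

# Stand-in for open('words.txt'): hypothetical sample data, one word per line.
f = io.StringIO("aa\naah\naahed\n")

seen_outer = []
inner = []
for line in f:                          # outer loop, like find_all_anagrams
    seen_outer.append(line.strip())
    inner = [l.strip() for l in f]      # inner loop, like find_anagrams: drains f

# The outer loop ran only once because the inner loop consumed the rest.
print(seen_outer)
print(inner)

# Rewinding makes the lines available again.
f.seek(0)
after_rewind = [l.strip() for l in f]
print(after_rewind)
```

The outer loop and the inner loop share one read position, so whichever loop reaches the end first ends the iteration for both.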
Even if you changed your program so that find_all_anagrams continued from the following line, it would be horribly slow because the time complexity is O(n^2): every word would be compared against every other word. Instead, you could read the file once and store the words in a dictionary where the key is the sorted word and the value is a list of words:
from collections import defaultdict
def key(word):
    return ''.join(sorted(word))
d = defaultdict(list)
with open('words.txt') as f:
    for line in f:
        line = line.strip()
        d[key(line)].append(line)
print d[key('dog')]
Output:
['dog', 'god']
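To get every anagram group at once, which is what find_all_anagrams was aiming for, the same dictionary can be filtered down to keys that hold more than one word. The following is a rough sketch, assuming Python 3; group_anagrams and the sample word list are hypothetical and stand in for the lines of words.txt.

```python
from collections import defaultdict

def group_anagrams(words):
    """Map each sorted-letter key to its words, keeping only real anagram groups."""
    groups = defaultdict(list)
    for word in words:
        groups[''.join(sorted(word))].append(word)
    # A key with a single word has no anagram partner, so drop it.
    return {k: v for k, v in groups.items() if len(v) > 1}

# Hypothetical sample; with the real file, pass (line.strip() for line in f).
sample = ['dog', 'god', 'cat', 'act', 'tac', 'bird']
result = group_anagrams(sample)
print(result)
```

Each word is sorted and stored once, so the file is read in a single pass instead of once per word.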
Upvotes: 2
Reputation: 2503
From within find_all_anagrams(f), you pass f to find_anagrams(word,f). find_anagrams then iterates over all the lines of the file in its "for line in t:" loop. By the time it returns to find_all_anagrams, it has already read the entire file, and there is nothing left to read.
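One way to keep the original structure while avoiding the exhausted file is to read the words into a list once and pass that list around, since a list can be iterated any number of times. This is a sketch under that assumption, written for Python 3 with the three functions simplified; the sample list is made up and stands in for the contents of words.txt.

```python
def is_anagram(a, b):
    # Same letters in a different order; a word is not its own anagram.
    return a != b and sorted(a) == sorted(b)

def find_anagrams(word, words):
    # words is a list, so iterating over it here does not consume anything.
    return [word] + [w for w in words if is_anagram(w, word)]

def find_all_anagrams(words):
    res = {}
    for word in words:
        key = ''.join(sorted(word))
        if key not in res:
            found = find_anagrams(word, words)
            if len(found) > 1:          # keep only words that have an anagram
                res[key] = found
    return res

# Hypothetical sample; with the real file: words = [line.strip() for line in f]
words = ['dog', 'god', 'cat', 'act', 'bird']
all_anagrams = find_all_anagrams(words)
print(all_anagrams)
```

This fixes the early stop but still compares every word against every other word, so it remains quadratic in the number of words.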
Upvotes: 0