Reputation: 1
I want to output a simple word list from any text document: every word listed once, with no duplicates. This is what I have so far, but it doesn't do anything. I am fairly new to Python. Thanks!
def MakeWordList():
    with open('text.txt','r') as f:
        data = f.read()
    return set([word for word in data])
Upvotes: 0
Views: 2153
Reputation: 6914
The loop for word in data iterates over data, which is a string, so your loop variable word gets a single character in each iteration. You would want to use something like data.split() to loop over the list of words.
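A minimal sketch of the difference, using a made-up sample string in place of the contents of text.txt:

```python
data = "the cat and the hat"  # hypothetical sample text, not from the question

# Iterating over a string yields one character at a time.
chars = set(word for word in data)

# split() with no argument breaks the string on any run of whitespace,
# so each item is a whole word.
words = set(word for word in data.split())

print(sorted(chars))
print(sorted(words))
```

The first set holds only single characters (including the space), while the second holds the four distinct words.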
Upvotes: 2
Reputation: 216
You can't iterate over the data you read like this: it is a string, so you get consecutive characters. However, you can split the string on whitespace, which gives you a list of words:
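def MakeWordList():
    with open('possible.rtf', 'r') as f:
        data = f.read()
    # split() with no argument splits on any whitespace, including newlines.
    # The conditions keep only lowercase words of 5+ characters without 'xx'.
    return {word for word in data.split()
            if len(word) >= 5 and word.islower() and 'xx' not in word}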
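If punctuation and capitalization should not produce separate entries, one possible variant is sketched below. The names make_word_list and make_word_list_from_file, the utf-8 default, and the regular expression that defines a "word" are all assumptions for illustration, not part of the question.

```python
import re


def make_word_list(text):
    """Return the unique words in text, lowercased and sorted."""
    # Assumed definition of a word: a run of ASCII letters or apostrophes.
    # Adjust the pattern if digits, hyphens, or non-ASCII letters matter.
    return sorted(set(re.findall(r"[a-z']+", text.lower())))


def make_word_list_from_file(path, encoding="utf-8"):
    """Read a text file and return its unique words."""
    with open(path, "r", encoding=encoding) as f:
        return make_word_list(f.read())


# The function only returns a value; nothing appears unless it is
# called and the result is printed.
sample = "The cat sat. The cat's hat!"
print(make_word_list(sample))
```

Note that defining a function does not run it, which may be why the original code appeared to do nothing: the result has to be printed or otherwise used by the caller.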
Upvotes: 0