James Shockley

Reputation: 1

Make a word list from any document in Python

I want to output a simple word list from any text document: every word listed, but no duplicates. This is what I have, but it doesn't do anything. I am fairly new to Python. Thanks!

def MakeWordList():
    with open('text.txt','r') as f:
        data = f.read()
    return set([word for word in data])

Upvotes: 0

Views: 2153

Answers (2)

taras

Reputation: 6914

The for word in data loop iterates over data, which is a string, so the loop variable word gets a single character on each iteration. Use something like data.split() to loop over the list of words instead.
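A minimal sketch of that suggestion, assuming the file is plain text and that whitespace-separated tokens count as words. The function name make_word_list and its path parameter are illustrative, not from the question:

```python
def make_word_list(path):
    """Return the unique whitespace-separated words in a text file."""
    with open(path, 'r') as f:
        data = f.read()
    # split() with no argument splits on any run of whitespace
    # (spaces, tabs, newlines) and never yields empty strings.
    return set(data.split())
```

Calling sorted(make_word_list('text.txt')) gives the words in alphabetical order. Punctuation stays attached to the words, so 'word' and 'word.' count as different entries.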

Upvotes: 2

KWierzbicki

Reputation: 216

You can't iterate over the data you read like this: it is a string, so you get consecutive characters. However, you can split the string on spaces, which gives you a list of words:

def MakeWordList():
    with open('possible.rtf', 'r') as f:
        data = f.read()
    # Split on spaces, then keep only lowercase words of 5+ characters
    # that do not contain 'xx'
    return {word for word in data.split(' ')
            if len(word) >= 5 and word.islower() and 'xx' not in word}
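One thing to watch when splitting on a single space, shown here on a made-up string: split(' ') does not break on newlines and yields an empty string for every doubled space, while split() with no argument treats any run of whitespace as one separator.

```python
text = "one two\nthree  four"

# Splitting on a single space keeps the newline inside a token
# and produces an empty string for the doubled space.
by_space = text.split(' ')

# Splitting with no argument treats any whitespace run as one separator.
by_whitespace = text.split()

print(by_space)       # ['one', 'two\nthree', '', 'four']
print(by_whitespace)  # ['one', 'two', 'three', 'four']
```

With the length filter above the empty strings are dropped anyway, but words joined across a line break would still end up as a single entry.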

Upvotes: 0
