thetna

Reputation: 7143

Creating a vocabulary in Python

I have a number of text files. I would like to use NLTK to preprocess them and write the vocabulary to a plain-text .txt file, so that I can distribute that file for other people to use. Here is what I did, starting with a single file:

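import nltk

# Read the whole file into a single string
with open("path/to/text/file", "r") as file1:
    raw = file1.read()
tokens = nltk.wordpunct_tokenize(raw)  # split into words and punctuation runs
words = [w.lower() for w in tokens]    # lowercase every token
vocab = sorted(set(words))             # unique tokens in sorted order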

Now I would like to write the list of items in vocab to a plain-text, human-readable .txt file. How would I do it?

Upvotes: 1

Views: 6830

Answers (1)

brice

Reputation: 25039

Write it out manually, one item per line:

with open("output.txt", "w") as f:
    for item in vocab:
        f.write(item + "\n")
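Since the question mentions several text files, here is a minimal sketch of how the same idea might extend to more than one input. The helper names `simple_tokenize`, `build_vocab`, and `write_vocab` are hypothetical, and the regex tokenizer is only a stand-in so the example runs without NLTK; passing `nltk.wordpunct_tokenize` as the `tokenize` argument should work the same way. UTF-8 is assumed for both input and output.

```python
import re
from pathlib import Path

# Stand-in tokenizer (an assumption, not NLTK itself): matches runs of
# word characters or runs of punctuation.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]+")


def simple_tokenize(text):
    """Split text into word and punctuation tokens."""
    return _TOKEN_RE.findall(text)


def build_vocab(paths, tokenize=simple_tokenize):
    """Return the sorted list of unique lowercased tokens across all files."""
    vocab = set()
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        vocab.update(token.lower() for token in tokenize(text))
    return sorted(vocab)


def write_vocab(vocab, out_path):
    """Write one vocabulary item per line to a UTF-8 text file."""
    with open(out_path, "w", encoding="utf-8") as f:
        f.writelines(item + "\n" for item in vocab)
```

Calling `write_vocab(build_vocab(["a.txt", "b.txt"]), "output.txt")` would then produce a single file with one token per line, which opens in any text editor.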

Upvotes: 4
