Reputation: 7143
I have a number of text files. I would like to use NLTK to preprocess them and write the vocabulary out in a plain-text .txt format, so that I can distribute those files for people to use. Here is what I did, starting with a single file:
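import nltk
file1 = open("path/to/text/file", "r")
raw = file1.read()
tokens = nltk.wordpunct_tokenize(raw)  # split into word and punctuation tokens
words = [w.lower() for w in tokens]    # call lower() so each token is lowercased
vocab = sorted(set(words))             # unique lowercased tokens, sorted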
Now I would like to write the list of items in vocab to a plain-text, human-readable .txt file. How would I do it?
Upvotes: 1
Views: 6830
Reputation: 25039
Write it out manually:
with open("output.txt", "w") as f:
    for item in vocab:
        f.write(item + "\n")
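For reference, here is a minimal end-to-end sketch of the same idea, not a definitive implementation. The helper names build_vocab and write_vocab are hypothetical, and the regular expression is only a stand-in that roughly mirrors what nltk.wordpunct_tokenize does, so the snippet runs without NLTK installed; swap the real tokenizer back in if you have it. Writing with an explicit UTF-8 encoding is an assumption about what the people receiving the file will expect.

```python
import re


def build_vocab(text):
    """Return the sorted, lowercased set of tokens found in text."""
    # Stand-in for nltk.wordpunct_tokenize (an assumption, not the NLTK call):
    # match runs of word characters, or runs of non-space punctuation.
    tokens = re.findall(r"\w+|[^\w\s]+", text)
    return sorted({t.lower() for t in tokens})


def write_vocab(vocab, path):
    """Write one vocabulary entry per line to a plain-text file."""
    with open(path, "w", encoding="utf-8") as f:
        for item in vocab:
            f.write(item + "\n")


if __name__ == "__main__":
    sample = "The cat. The dog!"  # illustrative input, not real data
    write_vocab(build_vocab(sample), "output.txt")
```

To cover several input files, one option is to collect the tokens from each file into a single set before sorting, then call write_vocab once.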
Upvotes: 4