Reputation: 81
I'm trying to find every word that appears in my tokens more than 1000 times and save those words to freq1000.
freq1000 = []
newtokens = []
for words in tokens:
    newtokens += words
FreqDist(newtokens)
fd_1 = FreqDist(newtokens)
for i in set(fd_1):
    if fd_1.count(i) == >1000:
        print(i)
This is my current code. I'm completely stuck after this, and I'm not sure whether there is a FreqDist function I can use to help. I have saved the FreqDist to fd_1 successfully; I'm just unsure how to get the words that appear more than 1000 times and save them to freq1000.
I would appreciate any help you can provide.
Upvotes: 2
Views: 1850
Reputation: 4370
You can filter the words by frequency count using fd_1.items(), which yields (word, count) pairs, like below:
list(filter(lambda x: x[1] > 1000, fd_1.items()))
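If you only want the words themselves in freq1000, something along these lines might work. This is just a minimal sketch: it uses collections.Counter as a stand-in for FreqDist (NLTK's FreqDist subclasses Counter, so the same lookups should apply to fd_1), and the helper name words_above and the toy tokens are made up for illustration.

```python
from collections import Counter
from itertools import chain


def words_above(tokens, threshold=1000):
    """Return the words that occur more than `threshold` times in `tokens`.

    `tokens` is assumed to be a list of word lists, as in the question.
    """
    # Flatten the nested lists into a single stream of words.
    flat = chain.from_iterable(tokens)
    # Counter stands in for FreqDist here; both map word -> count.
    counts = Counter(flat)
    # Keep only the word (not the count) when it is strictly above the threshold.
    return [word for word, count in counts.items() if count > threshold]


# Hypothetical toy data, with a small threshold so the result is easy to check.
tokens = [["the", "cat", "the"], ["the", "dog", "cat"]]
freq = words_above(tokens, threshold=1)
print(freq)  # ['the', 'cat']
```

With the real data you would keep the default threshold and assign the result to freq1000, e.g. freq1000 = words_above(tokens).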
Hope it helps :)
Upvotes: 1