Reputation: 1
I am trying to write a program that creates a file of English words roughly 2 GB in size. From this file I then want to print the frequency of each word using external sorting. After the external sort, the program only needs to print each word's count (frequency).
Upvotes: 0
Views: 830
Reputation: 123772
Python has a built-in function, sorted, which sorts an iterable. But even better than that, in versions 2.7 and later the standard library has a collection for counting the frequencies of things: collections.Counter. Assuming your large file has one word per line, you can do:
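from collections import Counter

with open("<giant-dictionary>") as words:  # replace with the path to your file
    # Strip the trailing newline so each key is the bare word.
    counts = Counter(line.strip() for line in words)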
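This will take a few minutes. The file is read line by line and the Counter keeps only one entry per distinct word, so the whole 2 GB file never has to fit in memory.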
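If you specifically want the external-sorting approach from the question, here is a minimal sketch of one way to do it, assuming the same one-word-per-line layout. It sorts fixed-size chunks in memory, writes each chunk to a temporary "run" file, merges the runs with heapq.merge, and counts adjacent equal words with itertools.groupby. The function names and the chunk_size default are my own choices for illustration, not part of any library.

```python
import heapq
import itertools
import os
import tempfile


def _write_run(sorted_words, tmp_dir):
    """Write one sorted chunk to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".run", text=True)
    with os.fdopen(fd, "w") as run:
        run.writelines(word + "\n" for word in sorted_words)
    return path


def external_word_counts(path, chunk_size=1_000_000):
    """Yield (word, count) pairs in sorted word order.

    chunk_size is the number of words sorted in memory at once; the
    default is an arbitrary assumption, tune it to the RAM you have.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Phase 1: split the input into sorted runs on disk.
        run_paths = []
        with open(path) as words:
            stripped = (line.strip() for line in words)
            nonempty = (word for word in stripped if word)
            while True:
                chunk = sorted(itertools.islice(nonempty, chunk_size))
                if not chunk:
                    break
                run_paths.append(_write_run(chunk, tmp_dir))

        # Phase 2: k-way merge of the runs, counting equal neighbours.
        runs = [open(p) for p in run_paths]
        try:
            # Compare bare words, not lines with their trailing newline.
            streams = [(line.rstrip("\n") for line in run) for run in runs]
            merged = heapq.merge(*streams)
            for word, group in itertools.groupby(merged):
                yield word, sum(1 for _ in group)
        finally:
            for run in runs:
                run.close()
```

You would then loop over the result, for example: for word, count in external_word_counts("words.txt"): print(word, count), where "words.txt" stands in for your own file. Note that all run files are open at the same time during the merge, so a very small chunk_size on a very large file can hit the operating system's limit on open files.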
Upvotes: 3