Reputation: 271
I have a text file and need to sort its lines alphabetically. Example input:
This is the first sentence
A sentence here as well
But how do I reorder them?
Output:
A sentence here as well
But how do I reorder them?
This is the first sentence
The catch: the file is so large that I don't have enough RAM to load it into a list/array. I tried Python's built-in sorted() function and the process got killed.
To give you an idea:
wc -l data
21788172 data
Upvotes: 6
Views: 1538
Reputation: 11236
Similar to what Hugh recommended (but different in that this isn't a pure-Python solution), you could sort the file in chunks. E.g., split the file into 26 other files (A.txt, B.txt, C.txt, etc.), sort each of those individually, and then combine them to get the final result.
The main thing to keep in mind is that the first pass through the source file only distributes the lines into files according to their first letter. Only after that do you sort each file. A simple cat A.txt B.txt ... will handle the rest.
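Here is a rough sketch of the bucket idea, done entirely in Python rather than with cat. The function name bucket_sort_file, the workdir parameter, and the hex-coded bucket file names are my own inventions for illustration. It buckets on the first character (not just A-Z), so it assumes the number of distinct first characters is small enough to keep one file handle open per bucket, and that each bucket fits in RAM by itself.

```python
import os


def bucket_sort_file(src, dst, workdir):
    """Sort the lines of src into dst using one bucket file per first character.

    Hypothetical helper: assumes each bucket fits in memory on its own.
    """
    os.makedirs(workdir, exist_ok=True)
    buckets = {}  # first character -> open bucket file
    try:
        # Pass 1: only distribute lines into buckets, no sorting yet.
        with open(src, encoding="utf-8") as f:
            for line in f:
                if not line.endswith("\n"):
                    line += "\n"  # last line may lack a newline
                key = line[0]
                if key not in buckets:
                    # Name the file by code point to avoid odd file names.
                    path = os.path.join(workdir, f"{ord(key):06x}.txt")
                    buckets[key] = open(path, "w", encoding="utf-8")
                buckets[key].write(line)
    finally:
        for fh in buckets.values():
            fh.close()

    # Pass 2: sort each bucket in memory and append it to the output.
    # Visiting the keys in sorted order is the equivalent of the final cat.
    with open(dst, "w", encoding="utf-8") as out:
        for key in sorted(buckets):
            with open(buckets[key].name, encoding="utf-8") as f:
                out.writelines(sorted(f))
```

Note this sorts by code point, the same as sorted() does, so uppercase lines come before lowercase ones. If most of your lines start with the same character, one bucket will be nearly as big as the original file and this won't help.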
Upvotes: 1
Reputation: 56634
It sounds like you need to do a merge-sort: divide the file into blocks, sort each block, then merge the sorted blocks back together. See "Python class to merge sorted files, how can this be improved?"
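A minimal sketch of that approach using only the standard library (this is not the code from the linked question). The function name external_sort and the lines_per_chunk default are assumptions; pick a chunk size that comfortably fits in your RAM. heapq.merge reads the sorted chunk files lazily, so only one line per chunk is held in memory during the merge.

```python
import heapq
import os
import tempfile
from itertools import islice


def external_sort(src, dst, lines_per_chunk=1_000_000):
    """Sort the lines of src into dst without loading the whole file.

    Hypothetical helper; lines_per_chunk is an assumed tuning knob.
    """
    chunk_paths = []
    try:
        # Step 1: read a block of lines, sort it, write it to a temp file.
        with open(src, encoding="utf-8") as f:
            while True:
                chunk = list(islice(f, lines_per_chunk))
                if not chunk:
                    break
                if not chunk[-1].endswith("\n"):
                    chunk[-1] += "\n"  # last line may lack a newline
                chunk.sort()
                fd, path = tempfile.mkstemp(suffix=".chunk")
                with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                    tmp.writelines(chunk)
                chunk_paths.append(path)

        # Step 2: merge the sorted chunk files line by line.
        files = [open(p, encoding="utf-8") for p in chunk_paths]
        try:
            with open(dst, "w", encoding="utf-8") as out:
                out.writelines(heapq.merge(*files))
        finally:
            for fh in files:
                fh.close()
    finally:
        # Clean up the temporary chunk files.
        for p in chunk_paths:
            os.remove(p)
```

With the roughly 21.8 million lines from the question and the default chunk size, this would produce about 22 temporary files, which is well within typical open-file limits.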
Upvotes: 5