Reputation:
If I have a relation (SQL) that does not fit in memory and I want to sort the relation using TPMMS (Two-pass multi-way merge sort method). How would I divide the table in sub tables (and how many) that can fit in memory and than merge them? Let's say I am using C#.
Upvotes: 4
Views: 7576
Reputation: 754530
I've not hunted down the current definition of a two-pass multi-way merge sort, but the theory of 'external sorting' (where the data is too big to fit in memory) is pretty much standard. Any decent book on algorithms will cover it; amongst many others, you could look at Knuth, Sedgewick, or (for the software archaeologists) Kernighan & Plauger Software Tools.
The basic technique is simple:
All dreadfully standard - but there are nit-picky details to get right.
There was a good article in AT&T's Unix System Readings Volume II called 'Theory and Practice in the Construction of a Working Sort Routine' that you should find and read if you are serious about learning how to handle external sorts. However, when you read it, remember that machines have changed dramatically since it was written, with gigabytes of main memory (instead of megabytes) and terabytes of disk space (or SSD — also instead of megabytes).
Upvotes: 9