Reputation: 145
There are 1 billion integers stored in a file. One line per integer. Memory can support loading of 1 million integers at a time. We need to display 100 largest integers.
My thoughts :
Upvotes: 1
Views: 209
Reputation: 7385
You just need to iterate over the file once:
Edit: Inserting a new number into the top 100 is O(n) if you use a sorted list and O(log(n)) if you use a heap. So if the performance of the process depends on the insertion it makes sense to use a heap. If it depends mostly on reading the file then it doesn't matter.
Upvotes: 2
Reputation: 80197
Build min-heap for the first 100 elements.
For every new element check - if it larger than heap top, remove top, insert new element.
Heap size always is 100.
So overall complexity is O(N * log(100)) = O(N)
(in common case of k top values - O(N log k))
Million is used just as max size of block that you read from file, then walk through it.
Upvotes: 4