Aditya Joshee
Aditya Joshee

Reputation: 145

Most efficient way to display top 100 integers from a file containing 1 billion integers. Memory can hold at max 1 million integers

There are 1 billion integers stored in a file. One line per integer. Memory can support loading of 1 million integers at a time. We need to display 100 largest integers.

My thoughts :

  1. Use a max heap data structure of size 100.
  2. Take 1st million integers from the file and put in the heap.

Upvotes: 1

Views: 209

Answers (2)

fafl
fafl

Reputation: 7385

You just need to iterate over the file once:

  • Have an ordered list of top 100 integers
  • Iterate over the file: If a number is big enough, put it in top 100

Edit: Inserting a new number into the top 100 is O(n) if you use a sorted list and O(log(n)) if you use a heap. So if the performance of the process depends on the insertion it makes sense to use a heap. If it depends mostly on reading the file then it doesn't matter.

Upvotes: 2

MBo
MBo

Reputation: 80197

Build min-heap for the first 100 elements.

For every new element check - if it larger than heap top, remove top, insert new element.

Heap size always is 100. So overall complexity is O(N * log(100)) = O(N)
(in common case of k top values - O(N log k))

Million is used just as max size of block that you read from file, then walk through it.

Upvotes: 4

Related Questions