user667967
user667967

Reputation: 648

Search Algorithm to find the k lowest values in a list (selection algorithm/problem)

I have a list that contains n double values and I need to find the k lowest double values in that list

What algorithm would you recommend?

At the moment I use Quicksort to sort the whole list, and then I take the first k elements out of the sorted list. I expect there should be a much faster algorithm.

Thank you for your help!!!

Upvotes: 6

Views: 4046

Answers (5)

rkp
rkp

Reputation: 13

you can use selection sort, it takes O(n) to select first lowest value. Once we have set this lowest value on position 1 we can rescan the data set to find out second lowest value. and can do it until we have kth lowest value. in this way if k is enough smaller then n then we will have complexity kn which is equivalent to O(n)...

Upvotes: 0

Raymond Hettinger
Raymond Hettinger

Reputation: 226256

You could model your solution to match the nlargest() code in Python's standard library.

  • Heapify the first k values on a maxheap.
  • Iterate over the remaining n - k values.
  • Compare each to the element of the top of the heap.
  • If the new value is lower, do a heapreplace operation (which replaces the topmost heap element with the new value and then sifts it downward).

The algorithm can be surprisingly efficient. For example, when n=100,000 and k=100, the number of comparisons is typically around 106,000 for randomly arranged inputs. This is only slightly more than 100,000 comparisons to find a single minimum value. And, it does about twenty times fewer comparisons than a full quicksort on the whole dataset.

The relative strength of various algorithms is studied and summarized at: http://code.activestate.com/recipes/577573-compare-algorithms-for-heapqsmallest

Upvotes: 11

amit
amit

Reputation: 178431

You can use selection algorithm to find the kth lowest element and then iterate and return it and all elements that are lower then it. More work has to be done if the list can contain duplicates (making sure you don't end up with more elements that you need).
This solution is O(n). Selection algorithm is implemented in C++ as nth_element()

Another alternative is to use a max heap of size k, and iterate the elements while maintaining the heap to hold all k smallest elements.

for each element x:
   if (heap.size() < k):
      heap.add(x)
   else if x < heap.max():
      heap.pop()
      heap.add(x)

When you are done - the heap contains k smallest elements.
This solution is O(nlogk)

Upvotes: 9

juanchopanza
juanchopanza

Reputation: 227390

You can use std::nth_element. This is O(N) complexity because it doesn't sort the elements, it just arranges them such that every element under a certain N is less than N.

Upvotes: 3

Asha
Asha

Reputation: 11232

Take a look at partial_sort algorithm from C++ standard library.

Upvotes: 2

Related Questions