Reputation: 67829
According to cppreference.com, the complexity of the C++ STL sorting algorithms is:
sort
: O(N log(N))
partial_sort
: "approximately" O(N log(M)), where M is distance(middle-first)
nth_element
: "on average" O(N)
However, this seems to imply that, instead of doing a partial_sort
, you could use nth_element
and then sort the first range, to give an overall complexity of O(N + M log(M)), which is a bit better than O(N log(M)). Is this actually true? Am I better off avoiding partial_sort
?
Upvotes: 24
Views: 5897
Reputation: 659
std::partial_sort
would perform partial sort for the M elements you are interested in. On the other hand std::nth_element
would only give you an array, such that nth element is placed such that all elements on the left are smaller and on the right are greater.
Use std::partial_sort
for use cases such as, getting top 10 results out of a million in order of rank. Use std::nth_element
for finding the median of an array, or to find out who stood 10th in exam results.
If you are just interested in the performance characteristics of both, for smaller values of M, std::partial_sort
would perform better than std::nth_element
(about 10,000) . For a detailed analysis of this, see: https://www.youtube.com/watch?v=-0tO3Eni2uo
std::nth_element
uses modified Quickselect, which provides O(N) complexity regardless of M.
std::partial_sort
uses Heapselect, which provides better performance than Quickselect for small M. As a side effect, the end state of Heapselect leaves you with a heap, which means that you get the first half of the Heapsort algorithm "for free".
std::partial_sort
is optimized for the case where M is a small constant relative to N. For example, taking the top 10 items from a very large variable-length list. It is not optimized for the other cases.
In a race between std::partial_sort
and std::nth_element
+ std::sort
, std::partial_sort
jumps out to an early lead (small M) but is overtaken by std::nth_element
+ std::sort
once M is no longer small.
Upvotes: 30
Reputation: 67829
After extensive testing, it seems that partial_sort
is faster for my use-case. I suspected as much -- but this seems to confirm it.
Upvotes: 1