Xarn

Reputation: 3561

Dual pivot quicksort in face of expensive swaps

Moved this question over to Programmers, as it didn't seem theoretical enough for CS.
TLDR
Has anyone tested dual pivot quicksort performance with expensive-to-swap elements? It seems that in this case, it should massively underperform standard quicksort.


Backstory
Inspired by a recent "question" here on Stack Overflow, I decided to go and implement non-trivial versions of the relevant sorts (introsort, quicksort with 3-way partitioning, median-of-3 pivot selection, insertion sort for small blocks, etc.).

During some research I also came upon dual-pivot quicksort, which is the current quicksort implementation in the Java standard library. The claim is that it is always at least as good as standard quicksort, and empirical testing seemed to support that. (Which is why it is the current implementation.)

However, it seems that no STL implementation uses dual-pivot quicksort for the quicksort phase of introsort, which made me wonder why. After more research I found this paper. It says that while dual-pivot quicksort performs on average 5% fewer comparisons, it performs significantly more swaps (approximately 80% more). Obviously, since Java has only primitives and reference types, swapping is always cheap. (Even so, Java uses this sort only for primitives, because it is not stable.)

So I wanted to see whether someone has already tested standard quicksort against dual-pivot quicksort when elements are expensive to swap and has the numbers (and possibly source) lying around, or whether I will have to test this myself.

This question is specifically about quicksort variants.
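If it comes to testing this myself, one approach is to instrument the element type rather than the sort. The sketch below is my own illustration (the names `BigElement` and `moves_to_sort` are mine, not from any library): it pads each element with a large payload and counts the element copies `std::sort` performs, so a dual-pivot implementation could be dropped in next to it for comparison.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical element type for benchmarking: a small key plus a large
// payload, so every copy the sort performs is expensive. The static
// counter records how many element copies take place. (No move operations
// are declared, so moves fall back to copies and get counted too.)
struct BigElement {
    int key = 0;
    std::array<char, 256> payload{};   // ballast that makes copies costly
    static std::size_t copy_count;

    BigElement() = default;
    BigElement(const BigElement& o) : key(o.key), payload(o.payload) {
        ++copy_count;
    }
    BigElement& operator=(const BigElement& o) {
        key = o.key;
        payload = o.payload;
        ++copy_count;
        return *this;
    }
};
std::size_t BigElement::copy_count = 0;

bool operator<(const BigElement& a, const BigElement& b) {
    return a.key < b.key;
}

// Sorts n pseudo-random elements with std::sort and reports how many
// element copies were needed; swap in another sort to compare the counts.
std::size_t moves_to_sort(std::size_t n) {
    std::mt19937 rng(12345);
    std::vector<BigElement> v(n);
    for (auto& e : v) e.key = static_cast<int>(rng());
    BigElement::copy_count = 0;
    std::sort(v.begin(), v.end());
    return BigElement::copy_count;
}
```

Comparing the counter across sort implementations, rather than wall-clock time, isolates the swap cost the question is about.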

Upvotes: 2

Views: 967

Answers (3)

S0lo

Reputation: 36

I have actually studied this extensively in my paper. https://arxiv.org/ftp/arxiv/papers/1505/1505.00558.pdf

The short answer is: NO. Dual-pivot doesn't perform as well when compared to a high-end version of classical quicksort when swapping large elements. Look at figures 22 and 23.

Upvotes: 2

If your goal is to reduce the number of swaps, you should fall back to sorting pointers. Something like this:

void sort(vector<BigClass> &data) {
    // First build an array of pointers into the original data.
    vector<BigClass*> temp;
    temp.reserve(data.size());
    for (BigClass& current : data) temp.push_back(&current);

    // Sort the pointers, comparing the objects they point to.
    std::sort(temp.begin(), temp.end(), [](const BigClass* a, const BigClass* b) {
        return *a < *b;  // assumes BigClass defines operator<
    });

    // Copy the objects into their final order.
    vector<BigClass> result;
    result.reserve(data.size());
    for (BigClass* current : temp) result.push_back(*current);

    // Update the argument.
    swap(result, data);
}

This is guaranteed to do precisely data.size() copy constructions. You can't do better.

Upvotes: 0

Leorge Takeuchi

Reputation: 94

I think there are three reasons why the QUICKSORTYAROSLAVSKIY() of the paper is not fast:

  1. It is difficult to get good pivots. Often the two pivots are too close together or too far apart.
  2. There are too many swaps. For example, the swapping at line 12 of the algorithm amounts to sliding part of the array.
  3. The insertion sort as implemented is not practical. The Knuth algorithm in the paper might have been written for teaching. I don't like partial array sliding.

A merit of QUICKSORTYAROSLAVSKIY() is that the two pivots are excluded from further partitioning.
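To make the swap behavior being criticized concrete, here is a simplified sketch of Yaroslavskiy-style dual-pivot partitioning on `int`s. It is my own illustration, not the paper's exact pseudocode or the Java implementation: pivots are simply the first and last elements, and there is no insertion-sort cutoff. Note how the `> q` branch performs an extra swap that the classical single-pivot scheme does not need.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Simplified Yaroslavskiy-style dual-pivot quicksort (illustrative only).
// Partitions into three regions: < p, between p and q, and > q; the two
// pivots end up in their final positions and are excluded from recursion.
void dual_pivot_sort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    if (a[lo] > a[hi]) std::swap(a[lo], a[hi]);
    const int p = a[lo], q = a[hi];          // p <= q
    int lt = lo + 1;                         // first slot of middle region
    int gt = hi - 1;                         // last unexamined slot
    int i = lo + 1;
    while (i <= gt) {
        if (a[i] < p) {
            std::swap(a[i], a[lt]);          // grow the "< p" region
            ++lt;
            ++i;
        } else if (a[i] > q) {
            while (i < gt && a[gt] > q) --gt;
            std::swap(a[i], a[gt]);          // grow the "> q" region
            --gt;
            // a[i] is re-examined next iteration; it may still be < p
        } else {
            ++i;                             // already in the middle region
        }
    }
    --lt;
    ++gt;
    std::swap(a[lo], a[lt]);                 // place pivot p
    std::swap(a[hi], a[gt]);                 // place pivot q
    dual_pivot_sort(a, lo, lt - 1);
    dual_pivot_sort(a, lt + 1, gt - 1);
    dual_pivot_sort(a, gt + 1, hi);
}

void dual_pivot_sort(std::vector<int>& a) {
    if (!a.empty()) dual_pivot_sort(a, 0, static_cast<int>(a.size()) - 1);
}
```

Maintaining three regions with two moving boundaries is exactly where the extra swaps come from: elements bound for the right region are exchanged immediately, and the swapped-in element may need a second swap toward the left region.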

I devised an algorithm to make quicksort faster, because swapping is redundant work and, when N is big, the pivot should be chosen from among various elements.

Upvotes: 0
