krom
krom

Reputation: 31

QuickSort slower than mergeSort

I was trying out different sorting algorithms. I was interested to really see the difference. I tried it on an array of 10 integers and everything works nicely, but of course, the run time is negligible on small amounts of data. So I tried it on 100,000 integers, and thats where I started to notice which ones are faster than the others.

I believe quicksort is known to be faster than mergesort so I was wondering what the problem in my code was. Referenced from ( https://www.youtube.com/watch?v=COk73cpQbFQ&list=PL2_aWCzGMAwKedT2KfDMB9YA5DgASZb3U&index=7 ) sorting algorithm series.

void quickSort(int num[], int start, int end) {
    if (start < end) {
        printf("\n.");
        int partitionIndex;
        partitionIndex = randomizedPartition(num, start, end);
        quickSort(num, 0, partitionIndex - 1);
        quickSort(num, partitionIndex + 1, end);
    }
}

int partition(int num[], int start, int end) {
    int partitionIndex = start;
    int pivot = end;
    int i;
    for (i = start; i < end; i++) {
        if (num[i] <= num[pivot]) {
            swap(num, i, partitionIndex);
            partitionIndex++;
        }
    }
    swap(num, partitionIndex, pivot);
    return partitionIndex;
}

int randomizedPartition(int num[], int start, int end) {
    int partitionIndex;
    int pivot = (rand() % (end - start + 1)) + start;
    swap(num, pivot, end);
    partitionIndex = partition(num, start, end);
    return partitionIndex;
}

the code above runs forever on an array of 100,000 integers while mergeSort runs for 18 seconds on my computer.

Upvotes: 2

Views: 304

Answers (3)

krom
krom

Reputation: 31

Everyone has helped very much but I would like to include the corrections I made:

First, the primary bug was that I used 0 instead of start so I replaced that. Next, I used `median-of-three'. Then, for my partitioning, thanks to @Julian, I ended up with this line of code:

int partition3(int num[], int start, int end, int pivotValue){
   int left = start;
   int right = end - 1;
   while(left <= right){
       while(num[left] <= pivotValue && left <= right) ++left;
       while(num[right] >= pivotValue && left <= right) --right;
       if(left < right) swap(num, left++, right--);
   }
   swap(num, left, end);
   return left;

}

And with that, My problem was not only solved but I was also able to improve my quickSort algorithm. Here are some references I used and am currently reading:

  1. cs.cornell.edu/courses/JavaAndDS/files/sort3Quicksort3.pdf
  2. https://www.cs.bham.ac.uk/~jxb/DSA/dsa.pdf

Thanks!

Upvotes: 1

Julian
Julian

Reputation: 4366

@sardok was right to point out a bug in your code. This is the main explanation for the speed difference. However:

I believe quicksort is known to be faster than mergesort

This is a bit of an oversimplification. Quicksort and mergesort are both optimal within their own domain: non-stable, in-place comparison sorts versus stable, copying comparison sorts, respectively.

For typical real-world data, you may very well find that quicksort performs a bit faster than mergesort, but there are definitey also datasets for which mergesort will perform better. In particular, when all keys are unique and the order of the data is fully randomized, you may find that a well-implemented mergesort is faster than a well-implemented quicksort.

I predict that when you have fixed the recursion bug, your quicksort implementation will still be a bit slower than your mergesort implementation. The reason for this is that you are generating a random number in order to select the pivot; this is an expensive operation. Using any single element from the array as the pivot (whether randomly selected or not) is also known to be suboptimal; you'll get better performance by selecting the median of the first, middle and last elements of the array (so-called median-of-three quicksort).

Furthermore, both quicksort and mergesort (but especially quicksort) are actually rather inefficient for small arrays. You'll get much better performance if you switch to insertionsort below array sizes of about 30 elements (the optimal threshold depends on the hardware and software platform, but 30 is a good ballpark threshold).

Finally, your partition algorithm can be made faster by passing the pivot as a separate value, without swapping it to the end of the array first, and by iterating from both ends:

int partition(int num[], int start, int end, int pivotValue) {
    while (start < end) {
        while (num[start] < pivotValue) ++start;
        while (num[end] > pivotValue) --end;
        swap(num, start++, end--);
    }
    return start;
}

This is faster because it avoids swapping elements that are already at the correct side of the array. Something to keep in mind, however, is that the originally selected pivot element is not necessarily at start by the end of the function; so after partitioning in this way, the element at the partitionIndex is not sorted yet. In other words, the right half of the array needs to recurse with quicksort(num, partitionIndex, end) instead of quicksort(num, partitionIndex + 1, end).

You can find much enlightenment about sorting algorithms by reading the original C++ STL implementation by Alex Stepanov.

Upvotes: 4

supercat
supercat

Reputation: 81123

Although your particular implementation has other problems discussed by other answers, it's useful to understand an important principle when judging sort performance.

Simple metrics of sorting-algorithm performance simply count the number of comparisons and swaps, on the assumption that all comparisons have equal cost, and all swaps have equal cost. Many real-world systems, however, are designed to optimize the performance of certain common sequences of memory operations; while it used to be that there might be a 2:1 or 3:1 difference in cost between accessing a favored or non-favored location, that difference has increased to be 100:1 or possibly more. Thus, sorting algorithms which access things in the optimal order may outperform those that don't, even if they end up performing more comparisons or swaps.

Upvotes: 2

Related Questions