Shmwel
Shmwel

Reputation: 1697

QuickSort best case is worst than average case

I have an annoying issue with quicksort. So, I have to study performance (in operations) of quicksort in Best, Average and Worst case.

The operations consist in : comparisons + attributions.

Currently I test quicksort in this cases like (100 up to 10.000 elements array). The problem comes up when I test it and I get the following results (e.g. 100 elements array) :

Best Case: aprox. 4853 operations

Average Case: aprox. 1468 operations

Worst Case: aprox. 9024 operations

The theory says QuickSort has efficiency of O(n*log n) in both best & average case. As you can see, I get a totally different result which violates the theory.

(I'm using Profiler as a custom library to generate random array. The last parameter of FillRandomArray method is order (0 - no ordered, 1 - ascending order, 2- descending order)).

Here is the code I use:

#include <iostream>
#include <stdio.h>
#include <conio.h>
#include <string.h>
#include "Profiler.h"

#define MIN_SIZE 100
#define MAX_SIZE 10000


struct sortingAlg{
        std::string type;
        int atributions;
        int comparisons;
};

int partition(int *givenArray, int p, int r, sortingAlg& sortingAlgoritm)
{
        int x = givenArray[r];
        int i = p - 1;
        for (int j = p; j <= r - 1; ++j)
        {
                sortingAlgoritm.comparisons += 1;
                if (givenArray[j] <= x)
                {
                        sortingAlgoritm.atributions += 2;
                        i += 1;
                        int aux = givenArray[i];
                        givenArray[i] = givenArray[j];
                        givenArray[j] = aux;
                }
        }

        sortingAlgoritm.atributions += 2;
        givenArray[r] = givenArray[i + 1];
        givenArray[i + 1] = x;
        return i + 1;
}

void quicksort(int *givenArray, int beginning, int length, sortingAlg& sortingAlgoritm)
{
        if (beginning < length)
        {
                int q = partition(givenArray, beginning, length, sortingAlgoritm);
                quicksort(givenArray, beginning, q-1,  sortingAlgoritm);
                quicksort(givenArray, q + 1, length, sortingAlgoritm);
        }
}

int main()
{
        Profiler profiler("heapProfiler");

        sortingAlg sortingAlgs[2];
        sortingAlgs[0].type = "HS";
        sortingAlgs[0].atributions = 0;
        sortingAlgs[0].comparisons = 0;

        sortingAlgs[1].type = "QS";
        sortingAlgs[1].atributions = 0;
        sortingAlgs[1].comparisons = 0;


        for (int i = MIN_SIZE; i <= MAX_SIZE; i += 100)
        {
                std::cout << "Sorting array for " << i << " elements.." << std::endl;


                sortingAlgs[1].atributions = 0;
                sortingAlgs[1].comparisons = 0;

                int *avg =  new int[i];
                FillRandomArray(avg, i, 0, 1000, false, 0);
                quicksort(avg, 1, i, sortingAlgs[1]);

                profiler.countOperation("AVG_QuickSort_ALL", i, sortingAlgs[1].atributions + sortingAlgs[1].comparisons);
                profiler.createGroup("AVG_QuickSort", "AVG_QuickSort_ALL");

                sortingAlgs[1].atributions = 0;
                sortingAlgs[1].comparisons = 0;

                int *best =  new int[i];
                FillRandomArray(best, i, 0, 1000, false, 1);
                quicksort(best, 1, i, sortingAlgs[1]);

                profiler.countOperation("BEST_QuickSort_ALL", i, sortingAlgs[1].atributions + sortingAlgs[1].comparisons);
                profiler.createGroup("BEST_QuickSort", "BEST_QuickSort_ALL");

                sortingAlgs[1].atributions = 0;
                sortingAlgs[1].comparisons = 0;


                int *worst = new int[i];
                FillRandomArray(worst, i, 0, 1000, false, 2);
                quicksort(worst, 1, i, sortingAlgs[1]);

                profiler.countOperation("WORST_QuickSort_ALL", i, sortingAlgs[1].atributions + sortingAlgs[1].comparisons);
                profiler.createGroup("WORST_QuickSort", "WORST_QuickSort_ALL");
        }
        std::cout << "Building complete...! Creating profiler groups... Opnening reports!" << std::endl;
        profiler.showReport();


        return 0;
}

Any ideas? Thanks.

Upvotes: 0

Views: 1220

Answers (2)

Jerry Coffin
Jerry Coffin

Reputation: 490573

The short answer is that it looks like you're not choosing the pivot correctly for in-order to be (even close to) the best case. In fact, given how you seem to be choosing the pivot, I'm surprised that sorting in-order data isn't even worse than you're showing.

For in-order data to be the best case, you want to choose the pivot as the element at the middle of the portion you're currently partitioning. In this case, you won't have to move any elements to do the partition.

As an aside: IMO, your code is unnecessarily difficult to read. Just for example, p and r are pretty close to meaningless as parameter names. Better names would help tremendously in deciphering your code. LIkewise, unless you have a very specific reason to do otherwise, I'd also consider replacing your:

                    int aux = givenArray[i];
                    givenArray[i] = givenArray[j];
                    givenArray[j] = aux;

with something like:

using std::swap;
// ...


                   swap(givenArray[i], givenArray[j]);

This is not only more readable, but potentially more efficient for code that works with elements of some type other than int, for which the most efficient swap may not be to copy entire elements.

Personally, if I wanted to profile counts of comparisons and assignments as you have, I'd do it rather differently: I'd define a type that kept track of the comparisons and assignments for that type:

template <class T>
class counted {
    static size_t comparisons;
    static size_t assignments;
    T val;
public:
    counted(T val) : val(val) {}
    bool operator<(counted c) {
        ++comparisons;
        return val < c.val;
    }

    counted &operator=(counted &other) { 
        ++assignments;
        val = other.val;
        return *this;
    }
    static void reset() { 
        assignments = 0;
        comparisons = 0;
    }
    std::pair<size_t, size_t> counts() { 
        return std::make_pair(assignments, comparisons); 
    }
};

Then the sorting code would just do sorting, and to profile the sorting code, you'd just pass an array (or preferably, vector) of this type that handles the profiling. Once the sorting is done, you can retrieve the counts from that type, reset the counts, and do the next test. This way, you can profile almost any sorting code, without having to rewrite the sorting code to do profiling (e.g., if you wanted to compare your quicksort to std::sort for various input orders, you could pretty easily do so).

Upvotes: 2

libik
libik

Reputation: 23049

I think there is a problem when you choose pivot.

For "best case" scenario, you should have a "best pivot" choosed, but you are not doing it. If you always choose pivot as a number in a middle, it would work.

Upvotes: 2

Related Questions