Chris Watts
Chris Watts

Reputation: 6715

QuickSort Stack Overflow

My generic QuickSort algorithm has two problems that I can't determine the cause of.

  1. The list is sometimes not correctly sorted
  2. When dealing with lists larger than 20,000 items, I often get stack overflows if the values repeat a lot (which I presume isn't supposed to be a real issue)

The list not being sorted correctly is a fairly rare occurrence, so the list has to be reasonably large. The paste below shows the first 13 elements of a 172-element list of installed software with the first column showing the output after the first sort, and second column for a second sort.

Adobe AIR                         7-Zip 4.65 (x64 edition)
7-Zip 4.65 (x64 edition)          Adobe AIR
Adobe Community Help              Adobe Community Help
Adobe Encore CS4 Codecs           Adobe Encore CS4 Codecs
Adobe Media Encoder CS4 Exporter  Adobe Media Encoder CS4 Exporter
Adobe Media Encoder CS4 Importer  Adobe Media Encoder CS4 Importer
Adobe Media Player                Adobe Media Player
Adobe Reader X (10.1.0)           Adobe Reader X (10.1.0)
Adobe Setup                       Adobe Setup
Adobe Setup                       Adobe Setup
Apple Application Support         Adobe Setup
Adobe Setup                       Apple Application Support
Apple Mobile Device Support       Apple Mobile Device Support
...                               ...

As you can see, the first time it was sorted, there is some incorrect behaviour, which is fixed by doing another sort.

The second problem with stack overflows happens if I sort my big-ol' Windows Event Log list. It seems to be happy if I sort 52,000 dates spanning 4 weeks; but if I sort 52,000 ID numbers which have many repeats, my stack size becomes 998 items long before it crashes the system. It also happens if I sort a 30,000 long list by a column which is mostly 'gupdate'.

Now, as far as I can see, the stack count should be log2(n) because the amount added to the stack is equal to the amount of cutting-in-half that is done.

I made my pivot random to aid this effect, but it didn't make a lot of difference.

Log2(60000) = 16. That's not enough for a stack overflow!

This is the code in concern:

private static void QuickSortFunction<T>(T[] array, int start, int end, Comparer<T> comparer)
{
    if (end - start >= 1)
    {
        int leftPtr, rightPtr, pivot;
        Random random = new Random();
        pivot = random.Next(start, end);
        Swap(array, pivot, start);
        pivot = start;

        leftPtr = start + 1;
        rightPtr = end;

        while (leftPtr < rightPtr)
        {
            while ((comparer.Compare(array[leftPtr], array[pivot])) <= 0 && (leftPtr < rightPtr))
            {
                leftPtr++;
            }

            while ((comparer.Compare(array[rightPtr], array[pivot])) >= 0 && (leftPtr <= rightPtr))
            {
                rightPtr--;
            }

            if (leftPtr < rightPtr)
            {
                Swap(array, leftPtr, rightPtr);
            }
        }
        Swap(array, pivot, rightPtr);
        pivot = rightPtr;

        QuickSortFunction(array, start, pivot - 1, comparer);
        QuickSortFunction(array, pivot + 1, end, comparer);
    }
}

private static void Swap<T>(T[] array, int pointer1, int pointer2)
{
    T temp = array[pointer1];
    array[pointer1] = array[pointer2];
    array[pointer2] = temp;
}

For anyone interested, this is the fix for the out-of-order glitch. Basically, it was failing to recognise a 2-element array when it was out of order. Such as {E, B}, which it would not change because it wouldn't look at its own pivot.

    if (end - start >= 1)
    {
        int leftPtr, rightPtr, pivot;
        Random random = new Random();
        pivot = random.Next(start, end);
        Swap(array, pivot, start);
        pivot = start;

        leftPtr = start;
        rightPtr = end;

        while (leftPtr < rightPtr)
        {
            while ((comparer.Compare(array[leftPtr], array[pivot])) < 0 && (leftPtr < rightPtr))
            {
                leftPtr++;
            }

            while ((comparer.Compare(array[rightPtr], array[pivot])) >= 0 && (leftPtr < rightPtr))
            {
                rightPtr--;
            }

            if (leftPtr < rightPtr)
            {
                Swap(array, leftPtr, rightPtr);
            }
        }
        Swap(array, pivot, rightPtr);
        pivot = rightPtr;

        QuickSortFunction(array, start, pivot - 1, comparer);
        QuickSortFunction(array, pivot + 1, end, comparer);
    }

Will update once I write a solution for the stack overflow.

Upvotes: 1

Views: 1080

Answers (3)

user2096889
user2096889

Reputation: 1

Watch the infinite loop you have created... the function must terminate for its intrinsic value to be actualized in use.

Upvotes: 0

Attila
Attila

Reputation: 28762

Let's look at the out-of-order issue first. the big loop goes until leftPtr >= rightPtr. Since your second loop tests for leftPtr <= rightPtr, it is possible that at the end leftPtr > rightPtr, in which case you are swapping (after the big loop) the pivot and an element that has been deemed "OK" (rightPtr points to something that has been passed by leftPtr)

Not sure about the stack overflow issue, but Hammar's suggestion seems reasonable especially that you said the problem occurs on large lists of many equal elements

Upvotes: 1

hammar
hammar

Reputation: 139840

Now, as far as I can see, the stack count should be log2(n) because the amount added to the stack is equal to the amount of cutting-in-half that is done.

That's only true if you split the input into halves of similar1 size. If, for example, you're sorting a list where all the items are equal, you get a very uneven split where you have no elements on one side and everything except the pivot on the other. In that case, you get O(n) stack size since each level only reduces the input size by 1.

One way of avoiding this is to use a three-way partitioning instead which places all the elements equal to the pivot in the middle.

1If the split is always better than some constant ratio, you're fine.

Upvotes: 1

Related Questions