Reputation: 6715
My generic QuickSort algorithm has two problems that I can't determine the cause of.
The list not being sorted correctly is a fairly rare occurrence, so the list has to be reasonably large. The paste below shows the first 13 elements of a 172-element list of installed software with the first column showing the output after the first sort, and second column for a second sort.
Adobe AIR 7-Zip 4.65 (x64 edition)
7-Zip 4.65 (x64 edition) Adobe AIR
Adobe Community Help Adobe Community Help
Adobe Encore CS4 Codecs Adobe Encore CS4 Codecs
Adobe Media Encoder CS4 Exporter Adobe Media Encoder CS4 Exporter
Adobe Media Encoder CS4 Importer Adobe Media Encoder CS4 Importer
Adobe Media Player Adobe Media Player
Adobe Reader X (10.1.0) Adobe Reader X (10.1.0)
Adobe Setup Adobe Setup
Adobe Setup Adobe Setup
Apple Application Support Adobe Setup
Adobe Setup Apple Application Support
Apple Mobile Device Support Apple Mobile Device Support
... ...
As you can see, the first time it was sorted, there is some incorrect behaviour, which is fixed by doing another sort.
The second problem with stack overflows happens if I sort my big-ol' Windows Event Log list. It seems to be happy if I sort 52,000 dates spanning 4 weeks; but if I sort 52,000 ID numbers which have many repeats, my stack size becomes 998 items long before it crashes the system. It also happens if I sort a 30,000 long list by a column which is mostly 'gupdate'.
Now, as far as I can see, the stack count should be log2(n) because the amount added to the stack is equal to the amount of cutting-in-half that is done.
I made my pivot random to aid this effect, but it didn't make a lot of difference.
Log2(60000) = 16. That's not enough for a stack overflow!
This is the code in concern:
private static void QuickSortFunction<T>(T[] array, int start, int end, Comparer<T> comparer)
{
if (end - start >= 1)
{
int leftPtr, rightPtr, pivot;
Random random = new Random();
pivot = random.Next(start, end);
Swap(array, pivot, start);
pivot = start;
leftPtr = start + 1;
rightPtr = end;
while (leftPtr < rightPtr)
{
while ((comparer.Compare(array[leftPtr], array[pivot])) <= 0 && (leftPtr < rightPtr))
{
leftPtr++;
}
while ((comparer.Compare(array[rightPtr], array[pivot])) >= 0 && (leftPtr <= rightPtr))
{
rightPtr--;
}
if (leftPtr < rightPtr)
{
Swap(array, leftPtr, rightPtr);
}
}
Swap(array, pivot, rightPtr);
pivot = rightPtr;
QuickSortFunction(array, start, pivot - 1, comparer);
QuickSortFunction(array, pivot + 1, end, comparer);
}
}
private static void Swap<T>(T[] array, int pointer1, int pointer2)
{
T temp = array[pointer1];
array[pointer1] = array[pointer2];
array[pointer2] = temp;
}
For anyone interested, this is the fix for the out-of-order glitch. Basically, it was failing to recognise a 2-element array when it was out of order. Such as {E, B}, which it would not change because it wouldn't look at its own pivot.
if (end - start >= 1)
{
int leftPtr, rightPtr, pivot;
Random random = new Random();
pivot = random.Next(start, end);
Swap(array, pivot, start);
pivot = start;
leftPtr = start;
rightPtr = end;
while (leftPtr < rightPtr)
{
while ((comparer.Compare(array[leftPtr], array[pivot])) < 0 && (leftPtr < rightPtr))
{
leftPtr++;
}
while ((comparer.Compare(array[rightPtr], array[pivot])) >= 0 && (leftPtr < rightPtr))
{
rightPtr--;
}
if (leftPtr < rightPtr)
{
Swap(array, leftPtr, rightPtr);
}
}
Swap(array, pivot, rightPtr);
pivot = rightPtr;
QuickSortFunction(array, start, pivot - 1, comparer);
QuickSortFunction(array, pivot + 1, end, comparer);
}
Will update once I write a solution for the stack overflow.
Upvotes: 1
Views: 1080
Reputation: 1
Watch the infinite loop you have created... the function must terminate for its intrinsic value to be actualized in use.
Upvotes: 0
Reputation: 28762
Let's look at the out-of-order issue first. the big loop goes until leftPtr >= rightPtr
. Since your second loop tests for leftPtr <= rightPtr
, it is possible that at the end leftPtr > rightPtr
, in which case you are swapping (after the big loop) the pivot and an element that has been deemed "OK" (rightPtr
points to something that has been passed by leftPtr
)
Not sure about the stack overflow issue, but Hammar's suggestion seems reasonable especially that you said the problem occurs on large lists of many equal elements
Upvotes: 1
Reputation: 139840
Now, as far as I can see, the stack count should be log2(n) because the amount added to the stack is equal to the amount of cutting-in-half that is done.
That's only true if you split the input into halves of similar1 size. If, for example, you're sorting a list where all the items are equal, you get a very uneven split where you have no elements on one side and everything except the pivot on the other. In that case, you get O(n) stack size since each level only reduces the input size by 1.
One way of avoiding this is to use a three-way partitioning instead which places all the elements equal to the pivot in the middle.
1If the split is always better than some constant ratio, you're fine.
Upvotes: 1