Reputation: 43
So I tried my best to optimize my Quicksort algorithm to run as efficiently as possible, even for sorted or nearly sorted arrays, using a pivot that is the median of three values, and also using insertion sort for small partition sizes. I have tested my code for large arrays of random values and it works, but when I pass an already sorted array I get a stack overflow error(ironic because it led me to find this website). I believe this to be a problem with my recursive calls(I know that the partitioning works for other data sets at least), but I can't quite see what to change.
This is part of my first semester data structures class so any code review will help as well. Thanks.
public void quickSort(ArrayList<String> data, int firstIndex, int numberToSort) {
if (firstIndex < (firstIndex + numberToSort - 1))
if (numberToSort < 16) {
insertionSort(data, firstIndex, numberToSort);
} else {
int pivot = partition(data, firstIndex, numberToSort);
int leftSegmentSize = pivot - firstIndex;
int rightSegmentSize = numberToSort - leftSegmentSize - 1;
quickSort(data, firstIndex, leftSegmentSize);
quickSort(data, pivot + 1, rightSegmentSize);
}
}
public int partition(ArrayList<String> data, int firstIndex, int numberToPartition) {
int tooBigNdx = firstIndex + 1;
int tooSmallNdx = firstIndex + numberToPartition - 1;
String string1 = data.get(firstIndex);
String string2 = data.get((firstIndex + (numberToPartition - 1)) / 2);
String string3 = data.get(firstIndex + numberToPartition - 1);
ArrayList<String> randomStrings = new ArrayList<String>();
randomStrings.add(string1);
randomStrings.add(string2);
randomStrings.add(string3);
Collections.sort(randomStrings);
String pivot = randomStrings.get(1);
if (pivot == string2) {
Collections.swap(data, firstIndex, (firstIndex + (numberToPartition - 1)) / 2);
}
if (pivot == string3) {
Collections.swap(data, firstIndex, firstIndex + numberToPartition - 1);
}
while (tooBigNdx < tooSmallNdx) {
while ((tooBigNdx < tooSmallNdx) && (data.get(tooBigNdx).compareTo(pivot) <= 0)) {
tooBigNdx++;
}
while ((tooSmallNdx > firstIndex) && (data.get(tooSmallNdx).compareTo(pivot) > 0)) {
tooSmallNdx--;
}
if (tooBigNdx < tooSmallNdx) {// swap
Collections.swap(data, tooSmallNdx, tooBigNdx);
}
}
if (pivot.compareTo(data.get(tooSmallNdx)) >= 0) {
Collections.swap(data, firstIndex, tooSmallNdx);
return tooSmallNdx;
} else {
return firstIndex;
}
}
Upvotes: 3
Views: 364
Reputation: 10565
You can avoid stack overflows without changing your algorithm too much. The trick is to tail-call optimize on the largest partition and only use recursion on the smallest one. This usually means your have to change your if
to a while
. I can't really test java code right now, but it should look something like:
public void quickSort(ArrayList<String> data, int firstIndex, int numberToSort) {
while (firstIndex < (firstIndex + numberToSort - 1))
if (numberToSort < 16) {
insertionSort(data, firstIndex, numberToSort);
} else {
int pivot = partition(data, firstIndex, numberToSort);
int leftSegmentSize = pivot - firstIndex;
int rightSegmentSize = numberToSort - leftSegmentSize - 1;
//only use recursion for the smallest partition
if (leftSegmentSize < rightSegmentSize) {
quickSort(data, firstIndex, leftSegmentSize);
firstIndex = pivot + 1;
numberToSort = rightSegmentSize;
} else {
quickSort(data, pivot + 1, rightSegmentSize);
numberToSort = leftSegmentSize;
}
}
}
This ensures that the call stack size will be at most O(log n)
, because on each call you only use recursion on an array of at most n/2
size.
Upvotes: 2
Reputation: 82461
In your partition
method you sometimes use a element outside the range:
String string1 = data.get(firstIndex);
String string2 = data.get((firstIndex + (numberToPartition - 1)) / 2);
String string3 = data.get(firstIndex + numberToPartition - 1);
(firstIndex + (numberToPartition - 1)) / 2
is not index of the middle element. That would be (firstIndex + (firstIndex + (numberToPartition - 1))) / 2
= firstIndex + ((numberToPartition - 1) / 2)
.
In fact if firstIndex > n/2
(where n
is the number of elements in the input) you're using a element with an index smaller than firstIndex
. For sorted arrays that means you choose the element at firstIndex
as pivot element. Therefore you get a recursion depth in
,
which causes the stack overflow for large enough inputs.
Upvotes: 1