Reputation: 15063
I've heard that in quicksort it is better to recurse on the smaller subarray first. For example, if 5 is the pivot and the data gets partitioned into 4,1,3,5,7,6, then it's better to sort the subarray 7,6 first because it contains two elements, whereas 4,1,3 contains three.
The source of all knowledge gives the pseudocode for quicksort
quicksort(A, i, k):
    if i < k:
        p := partition(A, i, k)
        quicksort(A, i, p - 1)
        quicksort(A, p + 1, k)
So an algorithm that implements a way to recurse on the smaller subarray first would look something like
quicksort(A, i, k):
    if i < k:
        p := partition(A, i, k)
        if p - i > k - p:
            /* the distance from the start point to the pivot is greater
               than the distance from the pivot to the end point */
            quicksort(A, p + 1, k)
            quicksort(A, i, p - 1)
        else:
            quicksort(A, i, p - 1)
            quicksort(A, p + 1, k)
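In Java, a direct translation of this variant looks something like the following (the partition scheme isn't specified above, so a Lomuto partition with the last element as pivot is assumed here):

```java
import java.util.Arrays;

public class SmallFirstQuickSort {
    static void quicksort(int[] a, int i, int k) {
        if (i < k) {
            int p = partition(a, i, k);
            if (p - i > k - p) {
                // the left side is larger, so sort the right (smaller) side first
                quicksort(a, p + 1, k);
                quicksort(a, i, p - 1);
            } else {
                quicksort(a, i, p - 1);
                quicksort(a, p + 1, k);
            }
        }
    }

    // Lomuto partition: the last element is the pivot; returns its final index.
    static int partition(int[] a, int i, int k) {
        int pivot = a[k];
        int m = i;
        for (int j = i; j < k; j++) {
            if (a[j] < pivot) {
                int t = a[m]; a[m] = a[j]; a[j] = t;
                m++;
            }
        }
        int t = a[m]; a[m] = a[k]; a[k] = t;
        return m;
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 1, 9, 2, 7};
        quicksort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a)); // prints [1, 2, 3, 5, 7, 8, 9]
    }
}
```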
I profiled code like this written in Java and it does appear to be faster, but why? At first I thought it had to do with tail-recursion optimization, but then realized that was definitely wrong, as Java doesn't support it.
Why is recursing on the smaller subarray first faster? This article claims it should be.
Upvotes: 6
Views: 1298
Reputation: 47992
I watched a Leslie Lamport talk recently in which he pointed out that Quicksort, as originally presented, is not necessarily a recursive algorithm, even though many programmers choose to implement it recursively.
Instead of recursing into the partitions, you can simply add them to a queue of ranges that still must be sorted. The algorithm iterates until the queue is empty. Whether you push and pull the range from the head or the tail determines whether you progress depth-first or breadth-first.
quicksort(a, begin, end) {
    queue<range> q;
    q.push_back(range(begin, end));
    while (!q.empty()) {
        r = q.pop_front();
        pivot = partition(a, r.begin, r.end);
        if (pivot > r.begin) q.push_back(range(r.begin, pivot - 1));
        if (pivot + 1 < r.end) q.push_back(range(pivot + 1, r.end));
    }
}
A recursive implementation just uses the stack as a LIFO queue, and therefore naturally proceeds in a depth-first manner, and I don't really think it matters which side you process next. The queue length is still bounded by the recursive depth.
But if you're working in a breadth-first manner, by using a FIFO queue as shown in the pseudo-code, then the order can matter. The larger partition will add more subranges to the queue, so you'd rather do that side once the queue is already as small as possible. To keep the queue smaller, you want to process the smaller subrange first.
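A runnable Java version of the pseudo-code above might look like this (the choice of `ArrayDeque` holding `{begin, end}` pairs, and the Lomuto partition, are implementation assumptions):

```java
import java.util.ArrayDeque;
import java.util.Arrays;

public class QueueQuickSort {
    // Pending inclusive [begin, end] ranges sit in a FIFO queue,
    // so the sort proceeds breadth-first instead of depth-first.
    static void quicksort(int[] a) {
        ArrayDeque<int[]> q = new ArrayDeque<>();
        if (a.length > 1) q.addLast(new int[]{0, a.length - 1});
        while (!q.isEmpty()) {
            int[] r = q.pollFirst();                 // pull from the head: breadth-first
            int pivot = partition(a, r[0], r[1]);
            // only enqueue subranges that still contain at least two elements
            if (pivot - 1 > r[0]) q.addLast(new int[]{r[0], pivot - 1});
            if (pivot + 1 < r[1]) q.addLast(new int[]{pivot + 1, r[1]});
        }
    }

    // Lomuto partition: the last element is the pivot; returns its final index.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int m = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) {
                int t = a[m]; a[m] = a[j]; a[j] = t;
                m++;
            }
        }
        int t = a[m]; a[m] = a[hi]; a[hi] = t;
        return m;
    }

    public static void main(String[] args) {
        int[] a = {4, 1, 3, 5, 7, 6, 2, 9, 8, 0};
        quicksort(a);
        System.out.println(Arrays.toString(a)); // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

Switching `pollFirst()` to `pollLast()` turns the same loop into a depth-first sort, which is the point about the queue discipline made above.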
Upvotes: 0
Reputation: 4661
Maybe I'm missing something subtle here, but if you recurse on your two cases in a different order, you're just traversing the recursion tree depth-first after potentially swapping the children subtrees at each node. The tree is still isomorphic to the original recursion tree, and the set of recursion depths for the base cases is still the same. The only way I can see getting a demonstrable reduction in recursion depth, or any other recursion-related speed-up, is something like this: recurse on the smaller subarray, then pick a pivot for the larger subarray (without recursing), and then recurse on the two subcases of the larger subarray. This turns your recursion tree into a ternary recursion tree instead of a binary one, which should typically have lower maximum depth.
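One way to sketch the ternary idea described above in Java (the variable names and the Lomuto partition are illustrative assumptions): each call recurses on the smaller side, then partitions the larger side within the same stack frame before recursing on its two halves, giving up to three child calls per frame.

```java
import java.util.Arrays;

public class TernaryQuickSort {
    static void quicksort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = partition(a, lo, hi);
        int sLo, sHi, lLo, lHi;
        if (p - lo < hi - p) {            // left side is smaller
            sLo = lo; sHi = p - 1; lLo = p + 1; lHi = hi;
        } else {                          // right side is smaller (or equal)
            sLo = p + 1; sHi = hi; lLo = lo; lHi = p - 1;
        }
        quicksort(a, sLo, sHi);           // smaller side: ordinary recursion
        if (lLo < lHi) {
            // split the larger side in this frame instead of recursing into it,
            // so this node gets three children rather than two
            int p2 = partition(a, lLo, lHi);
            quicksort(a, lLo, p2 - 1);
            quicksort(a, p2 + 1, lHi);
        }
    }

    // Lomuto partition: the last element is the pivot; returns its final index.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int m = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) {
                int t = a[m]; a[m] = a[j]; a[j] = t;
                m++;
            }
        }
        int t = a[m]; a[m] = a[hi]; a[hi] = t;
        return m;
    }

    public static void main(String[] args) {
        int[] a = {9, 4, 7, 1, 0, 5, 8, 2, 6, 3};
        quicksort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a)); // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```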
Upvotes: 1
Reputation: 1120
The Java JIT compiler can eliminate tail recursion, which reduces stack memory use. Reduced stack memory may in turn reduce cache pressure, allowing a larger partition to fit in a faster cache level.
Another minor improvement is a reduced number of function calls, and thus a minor reduction in the number of instructions executed. But instruction count usually has very low correlation with performance when operating on data that doesn't fit in cache.
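Whether or not the JIT eliminates the tail call, the same stack saving can be forced by hand: recurse only into the smaller partition and turn the call for the larger one into a loop. Since each recursive call then covers at most half the range, the stack depth is bounded by about log2(n) frames. A sketch, assuming a Lomuto partition:

```java
import java.util.Arrays;

public class BoundedDepthQuickSort {
    static void quicksort(int[] a, int lo, int hi) {
        // Loop on the larger side, recurse on the smaller side:
        // manual tail-call elimination, depth bounded by O(log n).
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p - lo < hi - p) {
                quicksort(a, lo, p - 1);  // smaller left side: recurse
                lo = p + 1;               // larger right side: iterate
            } else {
                quicksort(a, p + 1, hi);  // smaller right side: recurse
                hi = p - 1;               // larger left side: iterate
            }
        }
    }

    // Lomuto partition: the last element is the pivot; returns its final index.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int m = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) {
                int t = a[m]; a[m] = a[j]; a[j] = t;
                m++;
            }
        }
        int t = a[m]; a[m] = a[hi]; a[hi] = t;
        return m;
    }

    public static void main(String[] args) {
        int[] a = {6, 2, 9, 1, 8, 4, 0, 7, 3, 5};
        quicksort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a)); // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```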
Upvotes: 0