Wonger
Wonger

Reputation: 457

How do I use the median of an array as the pivot for quicksort

I have to write a quicksort algorithm that uses the median of the array as the pivot. From my general understanding from what I read in a book, I have to use a select algorithm in which I split the array into n/5 sub arrays, sort each of the sub arrays using insertion sort, find the median, then recursively call the select algorithm to find the median of the medians. I'm not even sure how to start this and I'm pretty confused. the selectMedian algorithm call is supposed to look something like this: SelectMedian(int first, int last, int i) where i is the ith index I want to select (in this case it would be the middle index, so array.length/2). The book I'm reading gives this description of it:

The algorithm in words (if n>1):

1. Divide n elements into groups of 5
2. Find median of each group (use insertion sort for this)
3. Use Select() recursively to find median x of the  n/5
medians
4. Partition the n elements around x.  Let k = rank(x)
5. if (i == k) then return x
if (i < k) then use Select() recursively to find i-th
smallest element in first partition else (i > k) use 
Select() recursively to find (i-k)th smallest element in 
last partition.

can anyone assist me in writing this algorithm? thanks!

Upvotes: 4

Views: 10886

Answers (3)

Avi Cohen
Avi Cohen

Reputation: 3414

QUICKSORT2(A, p, r)
if p<r
    median=floor((p + r) /2) - p + 1
    q=SELECT(A, p, r, median)
    q=PARTITION2(A, p, r, q)
    QUICKSORT2(A, p, q-1)
    QUICKSORT2(A, q+1, r)

PARTITION2(A, p, r, q)
switch between A[r] and A[q]
return PARTITION(A, p, r)

Upvotes: 0

Kane
Kane

Reputation: 4157

I assume you can figure out created n/5 sub-arrays of 5 elements each.

Finding the median of a subarray is fairly easy: you look at each element and find the element which has two smaller elements.

For example, you have 1 4 2 3 5. 1 has no smaller elements. 4 has three smaller elements. 2 has one smaller element. 3 has two smaller elements; this is the one you want.

Now you have found n/5 medians. You want to find the median of the medians, so you run the algorithm again.

Example:

1 7 2 4 9 0 3 8 5 6 1 4 7 2 3

[1 7 2 4 9][0 3 8 5 6][1 4 7 2 3]

findMedian([1 7 2 4 9]) = 4;

findMedian([0 3 8 5 6]) = 5;

findMedian([1 4 7 2 3]) = 3;

[4 5 3]

findMedian([4 5 3]) = 4;

4 is your pivot.

The reason you do this is to try and split your array evenly; if your array is split lopsided, you'll get O(N^2) performance; if your array is split evenly, you get O(NlogN) performance.

Selecting a random pivot means you could get either one - in practice it would balance out to O(NlogN) but a lot of applications want consistent performance, and random quicksort is not consistent from run to run.

The reason we use 5 (instead of 3 or 7) is because we're adding another term of time complexity searching for the median - this term has to be less than O(NlogN) but you want it to be as small as possible. Using 3 gets you O(N^2), using 5 gets you O(NlogN), and 5 is the smallest number for which this is true.

(the algorithm to find the median in linear time was given by Blum, Floyd, Pratt, Rivest and Tarjan in their 1973 paper "Time bounds for selection", and answered a famous open problem)

Upvotes: 0

AusCBloke
AusCBloke

Reputation: 18502

Would that really be necessary? Why not use the median of three where you select the pivot based on the median of three values, ie. the first, middle and last values.

Or you could even use a random pivot, which will drastically lower the chances of ending up with QuickSort's worst case time of O(N²), which may also be appropriate for your implementation.

Upvotes: 3

Related Questions