Anthony Kong
Anthony Kong

Reputation: 40834

Doing QuickSort for fun, but there is something I do not understand

Just want to ensure you that it is not a homework question. I am doing it just for some fun and probably to blog about.

I am implementation the 'in-place' version of Quick Sort in accordance to this wiki page.

Here is the code I have written:

# solution 2

def swap(arr, left, right):
    try:
        tmp = arr[left]
        arr[left] = arr[right]
        arr[right] = tmp
    except IndexError:
        print "!IndexError: left, right", left, right
        raise

def partition(arr, left, right, pindex):
    pivot = arr[pindex]
    swap(arr, right, pindex)
    npindex = left
    for i in range(left, right+1): # +1 so the last item is included
        if arr[i] < pivot:
            swap(arr, i, npindex)   
            npindex += 1
    swap(arr, npindex, right)
    return npindex

def qs(arr):
    qs2(arr, 0, len(arr)-1)
    return arr

def qs2(arr, left, right):
    if left < right:
        # midpoint = (right - left) / 2
        midpoint = left # this works.
        nindex = partition(arr, left, right, midpoint)
        qs2(arr, left, nindex-1)
        qs2(arr, nindex+1, right)

def test():
    test = [23, 45, 12, 1, 3, -9, 67, 122, 8, 2, 0]
    res = qs(list(test))
    print "start\t", test
    print "end\t", res

if __name__ == "__main__":
    test() 

What I do not understand is, according to the wikipedia, the choice of pivot (variable midpoint inside function qs2) can be random as long as it is within the bounds.

However, if I just take the midpoint (i.e. (left + right) /2 ), the quicksort does not work. Using the left value, it works.

Did I miss something in understanding the algorithm?

EDIT: I just realised there is a tag of 'quicksort' in stackoverflow. If this question has been asked before, my apology and please let me know. I will close this question. Meanwhile I am going through those questions with quicksort tag

Upvotes: 2

Views: 741

Answers (2)

kevmo314
kevmo314

Reputation: 4317

Selecting a pivot element is also complicated by the existence of integer overflow. If the boundary indices of the subarray being sorted are sufficiently large, the naive expression for the middle index, (left + right)/2, will cause overflow and provide an invalid pivot index. This can be overcome by using, for example, left + (right-left)/2 to index the middle element, at the cost of more complex arithmetic. Similar issues arise in some other methods of selecting the pivot element.

It should be (left + right) / 2, not (right - left) / 2. Wikipedia explains that the use of left + (right - left) / 2 is to avoid integer overflow.

Upvotes: 2

alexD
alexD

Reputation: 2364

you have midpoint = (right - left) / 2. I think you mean (right + left) / 2, otherwise your midpoint is outside of the given bounds.

Upvotes: 2

Related Questions