valdi.k
valdi.k

Reputation: 340

Quicksort - Hoare's partitioning with duplicate values

I have implemented the classic Hoare's partitioning algorithm for Quicksort. It works with any list of unique numbers [3, 5, 231, 43]. The only problem is when I have a list with duplicates [1, 57, 1, 34]. If I get duplicate values I enter an infinite loop.

private void quicksort(int[]a, int lo, int hi) {
    if (lo < hi) {
        int q = hoare_partition(a, lo, hi);
        quicksort(a, lo, q - 1);
        quicksort(a, q + 1, hi);
    }
}

private int hoare_partition(int[] a, int lo, int hi) {

    int pivot = a[hi];
    int i = lo;
    int j = hi;

    while (true) {

        while (a[i] < pivot) i++;
        while (a[j] > pivot) j--;

        if (i < j) swap(a, i, j);
        else return j;
    }
}

Is it possible that Hoare's implementation is incorrect and is unable to cope with duplicates?

I know this has been asked many times (and I tried them all) but I am having difficulties figuring the solution for this implementation.

Upvotes: 12

Views: 12956

Answers (6)

jake
jake

Reputation: 93

I know this post is old, but I believed I've figured out a solution. The problem arises when hoare_partition immediately finds that a[i] at i = lo needs to be swapped out of its position, but it never finds a suitable a[j] to swap with. We know this has occured if q == lo. This issue arises when the pivot is the smallest value in the array and is a duplicate. To address this, we check if q == lo in quicksort, and if so, we manually swap a[lo] with a[pivot_index] to make sure that a[lo] is moved out of its position and into a valid position. We only need to make one recursive call, since the left partition will be of size 1.

I also made slight modifications to the the conditions in the inner while loops. You need to check if i < j in the inner loops, in addition to the outer loop. The second inner while now checks if a[j] is >= pivot, rather than strictly >. The last recursive call uses q instead of q+1 as the new lo. I took pivot_index as a parameter. Lastly, I returned j instead of i.

private void quicksort(int[] a, int lo, int hi) {
  if (lo < hi) {
    // int pivot_index = (rand() % (hi - lo + 1)) + lo; // randomized pivot_index is better. should be in range [lo, hi] inclusive 
    int pivot_index = hi; 
    int q = hoare_partition(a, lo, hi, pivot_index);

    if (q == lo) {
      swap(a, lo, pivot_index);
      quicksort(a, lo + 1, hi);

    } else {
      quicksort(a, lo, q - 1);
      quicksort(a, q, hi);

     }
  }
}

private int hoare_partition(int[] a, int lo, int hi, int pivot_index) {
  int pivot = a[pivot_index];
  int i = lo;
  int j = hi;

  while (true) {
    
    while (i < j && a[i] < pivot) i++;
    while (i < j && a[j] >= pivot) j--;
    
    if (i < j) swap(a, i, j);
    else return j;
  }
}

Upvotes: 1

serg06
serg06

Reputation: 2695

Is it possible that Hoare's implementation is incorrect and is unable to cope with duplicates?

That's correct, Hoare's algorithm breaks when the pivot appears multiple times in the array.

One fix is to use a three-way partitioning algorithm. Instead of returning a single index, it returns two indices: Where the pivot section begins, and where it ends.

Upvotes: 0

rcgldr
rcgldr

Reputation: 28828

Here is a C++ example that implements Hoare scheme plus a median of 3 check, a duplicate of pivot check to exclude middle values equal to pivot (note that values equal to pivot can end up anywhere, not just the middle, so this doesn't help much), only using recursion on the smaller part, looping back for the larger part (this prevents stack overflow). Worst case time complexity is still O(n^2), but it takes fairly specific data patterns to produce this (median of 3 would have to consistently keep picking near smallest or near largest values).

void QuickSort(uint32_t a[], size_t lo, size_t hi) {
    while(lo < hi){
        size_t i = lo, j = (lo+hi)/2, k = hi;
        uint32_t p;
        if (a[k] < a[i])            // median of 3
            std::swap(a[k], a[i]);
        if (a[j] < a[i])
            std::swap(a[j], a[i]);
        if (a[k] < a[j])
            std::swap(a[k], a[j]);
        p = a[j];
        i--;                        // Hoare partition
        k++;
        while (1) {
            while (a[++i] < p);
            while (a[--k] > p);
            if (i >= k)
                break;
            std::swap(a[i], a[k]);
        }
        i = k++;
        while(i > lo && a[i] == p)  // exclude middle values == pivot
            i--;
        while(k < hi && a[k] == p)
            k++;
        // recurse on smaller part, loop on larger part
        if((i - lo) <= (hi - k)){
            QuickSort(a, lo, i);
            lo = k;
        } else {
            QuickSort(a, k, hi);
            hi = i;
        }
    }
}

Upvotes: 2

chau-phan
chau-phan

Reputation: 955

An address to @Terry Li's answer above as my reputation is not sufficient to comment. First of all, thank you very much for the detailed post. However, I believe there are some issues with the alternative function with while loop you provided. It doesn't return the index of the original pivot and it is inaccurate per se. (Please try with hoare_partition([6,7,8,9,1,2,3,4,5], 0, 8)) The problem is caused by incrementing i and decrementing j at the same time, causing you to lose track of the pivot. Hence in the proposed fix below, I have inserted a small condition to ensure whichever index storing the pivot is not changed. Please correct me if I'm wrong.

private static int hoare_partition(int[] a, int lo, int hi) {

            int pivot = a[lo];
            int i = lo;
            int j = hi;

            while (true) {

                while (a[i] < pivot) i++;

                while (a[j] > pivot) j--;

                if (i >= j) {
                    return i;
                }
                swap(a, i, j);
                if (a[j] == pivot)
                    i++;
                elif (a[i] == pivot)
                    j--;

            }
        }

Upvotes: 3

jbakirov
jbakirov

Reputation: 1056

Here is one more example with while loop

private static int partition(int[] a, int start, int end){
   int i = start-1, j = end+1;

   while(true){
      while(a[++i] < a[start]);
      while(a[start] < a[--j]);

      if (i >= j) return j;
      swap(a, i, j);
   }
}

Upvotes: 0

Terry Li
Terry Li

Reputation: 17268

algorithm partition(A, lo, hi) is
    pivot := A[lo]
    i := lo – 1
    j := hi + 1
    loop forever
        do
            i := i + 1
        while A[i] < pivot

        do
            j := j – 1
        while A[j] > pivot

        if i >= j then
            return j

        swap A[i] with A[j]

The pseudocode above is taken from Wikipedia. Let's compare it with your code.

The problem is that you have to move the indices after the swap. The pseudocode uses do-while loop instead of while loop to move the indices after the swap. Also pay attention to the initial values of i and j.

algorithm quicksort(A, lo, hi) is
    if lo < hi then
        p := partition(A, lo, hi)
        quicksort(A, lo, p)
        quicksort(A, p + 1, hi)

For the recursive step, you might need to take care of the edge cases (i.e., the indices). It should work if you change quicksort(a, lo, q-1) to quicksort(a, lo, q).

A complete working version I just wrote:

import java.util.Arrays;

public class Test {

    public static void main(String[] args) throws Exception {
        int[] input = {1, 57, 1, 34};
        quicksort(input, 0, input.length - 1);
        System.out.println(Arrays.toString(input));
    }

    private static void quicksort(int[]a, int lo, int hi) {
        if (lo < hi) {
            int q = hoare_partition(a, lo, hi);
            quicksort(a, lo, q);
            quicksort(a, q + 1, hi);
        }
    }

    private static int hoare_partition(int[] a, int lo, int hi) {

        int pivot = a[lo];
        int i = lo - 1;
        int j = hi + 1;

        while (true) {
            do {
                i++;
            }
            while (a[i] < pivot);

            do {
                j--;
            }
            while (a[j] > pivot);

            if (i >= j) {
                return j;
            }
            swap(a, i, j);

        }
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

}

If you prefer the while loop instead of do-while:

private static int hoare_partition(int[] a, int lo, int hi) {

            int pivot = a[lo];
            int i = lo;
            int j = hi;

            while (true) {

                while (a[i] < pivot) i++;

                while (a[j] > pivot) j--;

                if (i >= j) {
                    return j;
                }
                swap(a, i, j);
                i++;
                j--;

            }
        }

Upvotes: 12

Related Questions