iKul
iKul

Reputation: 31

Skiena's Quick Sort implementation

I find it hard to understand Skiena's quick sort. Specifically, what he is doing with the partition function, especially the firsthigh parameter?

quicksort(item_type s[], int l, int h) {
    int p;                  /* index of partition */
    if ((h - l) > 0) {
            p = partition(s, l, h);
            quicksort(s, l, p-1);
            quicksort(s, p+1, h);
    }
}

We can partition the array in one linear scan for a particular pivot element by maintaining three sections of the array: less than the pivot (to the left of firsthigh), greater than or equal to the pivot (between firsthigh and i), and unexplored (to the right of i), as implemented below:

int partition(item_type s[], int l, int h) {

    int i;           /* counter */
    int p;           /* pivot element index */
    int firsthigh;   /* divider position for pivot element */

    p = h;
    firsthigh = l;
    for (i = l; i  <h; i++) {
        if (s[i] < s[p]) {
            swap(&s[i],&s[firsthigh]);
            firsthigh ++;
        }

    swap(&s[p],&s[firsthigh]);
    return(firsthigh);
}

Upvotes: 1

Views: 518

Answers (2)

j_random_hacker
j_random_hacker

Reputation: 51226

At all times, everything strictly to the left of firstHigh is known to be less than the pivot (notice that there are initially no elements in this set), and everything at or to the right of it is either unknown, or known to be >= the pivot. Think of firstHigh as the next place where we can put a value lower than the pivot.

This algorithm is very similar to the in-place algorithm you would use to delete all items that are >= the pivot while "compacting" the remaining items as far to the left as possible. For the latter, you would maintain two indices l and firstHigh (which you could think of as from and to, respectively) that both start at 0, and walk l through the array; whenever you encounter an s[l] that should not be removed, you shunt it as far left as possible: i.e., you copy it to s[firstHigh] and then you increment firstHigh. This is safe because we always have firstHigh <= l. The only difference here is that we can't afford to overwrite the deleted (possibly->=-to-pivot) item currently residing at s[firstHigh], so we swap the two items instead.

Upvotes: 0

Marco A.
Marco A.

Reputation: 43662

I recommend following the reasoning with pencil and paper while reading through this answer and its considered example case

Some parenthesis are missing from the snippet:

int partition(item_type s[], int l, int h)
{
  int i;/* counter */
  int p;/* pivot element index */
  int firsthigh;/* divider position for pivot element */
  p = h;
  firsthigh = l;
  for (i = l; i < h; i++) {

    if (s[i] < s[p]) {
      swap(s[i], s[firsthigh]);
      firsthigh++;
    }
  }

  swap(s[p], s[firsthigh]);
  return(firsthigh);
}

void quicksort(item_type s[], int l, int h)
{
  int p;                  /* index of partition */
  if ((h - l)>0) {
    p = partition(s, l, h);
    quicksort(s, l, p - 1);
    quicksort(s, p + 1, h);
  }
}

Anyway the partition function works as follows: suppose we have the array { 2,4,5,1,3 } of size 5. The algorithm grabs the last element 3 as the pivot and starts exploring the items iteratively:

2 is first encountered.. since 2 is less than the pivot element 3, it is swapped with the position 0 pointed by firsthigh. This has no effect since 2 is already at position 0

2,4,5,1,3
^

firsthigh is incremented since 2 is now a stable value at that position.

Then 4 is encountered. This time 4 is greater than 3 (than the pivot) so no swap is necessary. Notice that firsthigh continues pointing to 4. The same happens for 5.

When 1 is encountered, this value should be put after 2, therefore it is swapped with the position pointed by firsthigh, i.e. with 4's position

2,4,5,1,3
  ^   ^ swap
2,1,5,4,3
    ^ now firsthigh points here

When the elements end, the pivot element is swapped with firsthigh's position and therefore we get

2,1,| 3,| 4,5

notice how the values less than the pivot are put on the left while the values greater than the pivot remain on the right. Exactly what is expected by a partition function.

The position of the pivot element is returned and the process is repeated on the subarrays on the left and right of the pivot until a group of 0 elements is encountered (the if condition is the bottom of the recursion).

Therefore firsthigh means: the first element greater than the pivot that I know of. In the example above firsthigh is put on the first element since we still don't know if that element is greater or less than the pivot

2,4,5,1,3
^

as soon as we realize 2 is not the first element greater than the pivot or we swap a less-than-the-pivot element in that position, we try to keep our invariant valid: ok, advance firsthigh and consider 4 as the first element greater than the pivot. This gives us the three sections cited in the textbook.

Upvotes: 1

Related Questions