Reputation: 13
I am trying to implement quicksort in C sorting a char array. I am using the recursive approach which can also be found on wikipedia. Here's the relevant code:
#define TSIZE 32
#define TNUM 100000
Swap macro:
#define SWAP(a, b, size) \
do \
{ \
size_t __size = (size); \
char *__a = (a), *__b = (b); \
do \
{ \
char __tmp = *__a; \
*__a++ = *__b; \
*__b++ = __tmp; \
} while (--__size > 0); \
} while (0)
This swap macro is taken from the qsort function that is already implemented in C.
The char array:
char TWEETS[TNUM * TSIZE];
I am omitting the part where I read in all the data into the char array, because the setup I have here works perfectly fine with the qsort() function. Here however is an excerpt of how the data that is being read in looks like:
0 0 Feb 09 @RafinhaCosgrove OIE VOLTEI <33
0 1 Feb 19 @Nha_RDO tanyak ajah , taik idup dia!
0 2 Mar 08 @w0nderlxss No Hi's Mine
The initial quicksort call:
quicksort(TWEETS, 0, TNUM-1, compare);
The compare function:
int compare(const void* ptr1, const void* ptr2) {
int i;
unsigned char * t1 = (unsigned char *) ptr1;
unsigned char * t2 = (unsigned char *) ptr2;
for (i = 6; i < TSIZE; i++) {
if (t1[i] > t2[i])
return -1;
if (t2[i] > t1[i])
return 1;
}
return 0;
}
The quicksort function:
void quicksort(void* base, int left, int right, int (*compar)(const void* , const void*))
{
if(left < right) {
int pivot = partition(base, left, right, compar);
quicksort(base, left, pivot-1, compar);
quicksort(base, pivot+1, right, compar);
}
}
And finally the partition function:
int partition(void* arr, int left, int right, int (*compar)(const void* , const void*))
{
char* cArr = (char*) arr;
int i;
int pivotIndex = (left+right)/2;
char* pivotValue = &cArr[TSIZE * pivotIndex];
int index = left;
SWAP(&cArr[TSIZE * pivotIndex], &cArr[TSIZE * right], TSIZE);
for(i = left; i < right; i++) {
if(compar((void*) &cArr[TSIZE * i], (void*) pivotValue) < 0) {
SWAP(&cArr[TSIZE * i], &cArr[TSIZE * index], TSIZE);
index++;
}
}
SWAP(&cArr[TSIZE * index], &cArr[TSIZE * right], TSIZE);
return index;
}
Now there are a few things that you should be aware of:
1) The code setup works when using an int array and just a few numbers to sort instead of the char array.
2) The code setup (reading in the data etc.) works when just using the qsort() function. I am also using the result of this as a comparison for the output of my own implementation.
3) Since it works with qsort(), the comparison function should not be at fault.
4) Since it works with qsort(), and the swap macro is taken from the qsort() implementation, it should not be at fault either.
For the sake of completeness, here's the relevant code parts when using an int array instead of a char array (which is, once again, working).
Call in main function:
int array[15] = {9, 6, 2, 0, 3, 1, 4, 8, 5, 7, 7, 50, 132, 12, 45};
quicksort(array, 0, 14, compare);
Swap & Partition functions:
void swap(int* a, int* b)
{
int temp = *a;
*a = *b;
*b = temp;
}
int partition(int* arr, int left, int right, int (*compar)(const void* , const void*))
{
int i;
int pivotIndex = (left+right)/2;
int pivotValue = arr[pivotIndex];
int index = left;
swap(&arr[pivotIndex], &arr[right]);
for(i = left; i < right; i++) {
if(arr[i] < pivotValue) {
swap(&arr[i], &arr[index]);
index++;
}
}
swap(&arr[index], &arr[right]);
return index;
}
As you might see I am totally at a loss here. Any help is greatly appreciated.
Upvotes: 1
Views: 2242
Reputation:
Somewhat to my surprise the partition function appears to work. At any time the items below index
are less than the pivot-value and all items between index
and i
are greater than or equal to the pivot-value. This is very cunning and deceptively simple. There are two issues with it:
when index
and i
are the same, and the value is < pivot value, then it does a swap of itself to itself. If the swap is expensive, then a test for "self swap" may save a little.
when it swaps, it is moving the value at the index
forwards to i
, and stepping both. If the index
arrives at the value, it may have to swap it forwards again. So, it's possibly doing extra swaps to shuffle values along ahead of the advancing `index'.
Consider five values: 9 3 5 1 4
-- 5 will be chosen as the pivot-value, and swapped with the 4, giving 9 3 4 1 5
. Starts with index == 0
and i == 0
. The 9 is greater than the pivot, so leave index
and advance i
. Now 3 is less than the pivot, so we swap 9 and 3 and advance both index
and i
, giving 3 9 4 1 5
. Now 4 is less than the pivot, so swap again, shuffling the 9 forwards, giving 3 4 9 1 5
. And 1 is also less than the pivot, so swap again, giving 3 4 1 9 5
. Swapping the pivot value into place completes the process to give 3 4 1 5 9
.
So, this does 3 swaps to shuffle the 9 along, where only one swap is required.
A common way of doing the partition works from the left and from the right, looking for values that need to be swapped up and down, so that the process is done in the minimum number of swaps.
I tried this for vectors of 50 integers, each value selected at random from 1..50. I ran the 'Wikipedia' partition and a more 'Traditional' one, and counted the swaps over 20,000 trials. I got:
Average 'trad' swaps = 9.8
Average 'wiki' swaps = 23.0
Average 'wiki' selfs = 2.9
Average 'wiki' extra = 13.0
where 'trad' minimises the number of swaps. The "selfs" is swaps where index
== i
, and extras
are where a value is swapped forward more than once -- unsurprisingly the 'wiki' swaps less the 'wiki' extras is about the same as the 'trad' swaps.
With all due respect to Messrs Wikipedia and Sons, their partition function blows chunks.
Upvotes: 0
Reputation: 4530
In partition()
, change
char* pivotValue = &cArr[TSIZE * pivotIndex];
to
char* pivotValue = &cArr[TSIZE * right];
since after
SWAP(&cArr[TSIZE * pivotIndex], &cArr[TSIZE * right], TSIZE);
the pivot element is located at right
and no longer at pivotIndex
.
And your algorithm is sorting backwards. Change the sign of compare()
if that's not what you wanted.
Upvotes: 1
Reputation: 7118
I found the program is behaving as coded. In partition call, there was one problem of type casting, need explicit casting from void * to int *, rest of the program works with your sample data. The output is: 0 1 2 3 4 5 6 7 7 8 9 12 45 50 132
#include <iostream>
#include<vector>
using namespace std;
int compare(const int* ptr1, const int* ptr2) {
if (*ptr1 > *ptr2)
return -1;
else if (*ptr1 > *ptr2)
return 1;
return 0;
}
void swap(int* a, int* b)
{
int temp = *a;
*a = *b;
*b = temp;
}
int partition(int* arr, int left, int right, int (*compar)(const int* , const int*))
{
int i;
int pivotIndex = (left+right)/2;
int pivotValue = arr[pivotIndex];
int index = left;
swap(&arr[pivotIndex], &arr[right]);
for(i = left; i < right; i++) {
if(arr[i] < pivotValue) {
swap(&arr[i], &arr[index]);
index++;
}
}
swap(&arr[index], &arr[right]);
return index;
}
void quicksort(void* base, int left, int right, int (*compar)(const int* , const int*))
{
if(left < right) {
int pivot = partition((int *)base, left, right, compar);
quicksort(base, left, pivot-1, compar);
quicksort(base, pivot+1, right, compar);
}
}
int main( )
{
int array[15] = {9, 6, 2, 0, 3, 1, 4, 8, 5, 7, 7, 50, 132, 12, 45};
quicksort(array, 0, 14, compare);
for (int i = 0; i < 15; i++) {
cout << array[i] << " " ;
}
cout << endl;
return 0;
}
Upvotes: 0