SolskGaer
SolskGaer

Reputation: 81

Is there any restriction about the compare function used in qsort()

I wrote this code snippet to sort an array of strings into an order that minimises their concatenation:

#include<stdio.h>
#include<string.h>
#include<stdlib.h>

int cmpstr(const void* p1, const void* p2){
    int p1l = strlen((const char*)p1);
    int p2l = strlen((const char*)p2);
    int r = strncmp((const char*)p1, (const char*)p2, p1l<p2l?p1l:p2l);
    if(r == 0 && p1l != p2l){
        if(p1l < p2l){
            return cmpstr(p1, (const char*)p2 + p1l);
        }
        return cmpstr((const char*)p1 + p2l, p2);
    }
    return r;
}

int main(){
    const char* arrstr[] = {"93", "936", "15", "152", "946"};
    int num = sizeof(arrstr) / sizeof(char*);
    qsort(arrstr, num, sizeof(char*), cmpstr);
    for(int i = 0; i < num; i++){
        printf("%s\n", arrstr[i]);
    }
}

These strings should sort in the order 15 152 936 93 946. We want 93 to be between 936 and 946 because 936 93 < 93 936 and 93 946 < 946 93 (ignoring the spaces added for clarity).

But the code didn't work as expected. The array didn't get sorted at all, although my tests of cmpstr() worked exactly as I expected.

What did I get wrong?

I noticed that when I change the cast part of the cmpstr() from *(char* const*) to (char*), qsort() doesn't work as well. Why is that?

Upvotes: 0

Views: 310

Answers (2)

Toby Speight
Toby Speight

Reputation: 30728

Your comparison function receives pointers to the elements of the array. Each element is a pointer to char, so you'll get a pointer to a pointer to char.

The comparison logic is also somewhat over-complicated; here's a working version:

#include<stdio.h>
#include<string.h>
#include<stdlib.h>

int cmpstr(const void* a_, const void* b_)
{
    const char *const *a = a_;
    const char *const *b = b_;
    int la = strlen(*a);
    int lb = strlen(*b);

    if (la == lb) {
        /* same length - sort lexicographically */
        return strcmp(*a, *b);
    }

    if (la < lb) {
        /* a is shorter */
        int result = strncmp(*a, *b, la);
        if (!result) {
            /* a is a prefix of b */
            result = strcmp(*a, *b + la);
        }
        return result;
    }

    /* else, b is shorter - re-enter with arguments swapped,
       and negate the result */
    return -cmpstr(b_, a_);
}

int main() {
    const char* arrstr[] = {"93", "936", "15", "152", "946"};
    const size_t num = sizeof arrstr / sizeof *arrstr;
    qsort(arrstr, num, sizeof *arrstr, cmpstr);
    for (size_t i = 0; i < num; i++) {
        printf("%s\n", arrstr[i]);
    }
}

Output:

15
152
936
93
946

If you think that my cmpstr() above deviates too far from the original, consider this less intrusively-modified code, which uses the recursive comparison that you intended, with a separate wrapper to adapt it to fit qsort():

int compare_strings(const char *a, const char *b)
{
    int la = strlen(a);
    int lb = strlen(b);
    int r = strncmp(a, b, la<lb?la:lb);
    if (r == 0 && la != lb) {
        if (la < lb) {
            return compare_strings(a, b + la);
        }
        return compare_strings(a + lb, b);
    }
    return r;
}

int compare_strings_qsort(const void* a_, const void* b_)
{
    const char *const *a = a_;
    const char *const *b = b_;
    return compare_strings(*a, *b);
}

I still had to change your variable names, as I find p1l and the like hard to read. I could simplify a bit further, which I think is clearer than both the original function and my first attempt above (but probably needs some comments):

int compare_strings(const char *a, const char *b)
{
    const int la = strlen(a);
    const int lb = strlen(b);
    const int r = strncmp(a, b, la<lb?la:lb);

    return (la == lb || r)
        ? r
        : (la < lb)
        ? compare_strings(a, b + la)
        : compare_strings(a + lb, b);
}

Upvotes: 2

dbush
dbush

Reputation: 223699

The comparison function passed to qsort receives the address of the two array elements to compare. Since each array element is a char *, the address of each of these elements is a char **. So you're missing one level of indirection.

You need to cast each parameter to a char * const *, then dereference to get the pointer to the string:

int cmpstr(const void* p1p, const void* p2p){
    char *p1 = *(char * const *)p1p;
    char *p2 = *(char * const *)p2p;
    ...
}

EDIT:

Because you want to call this function recursively, you need a non-recursive wrapper function around your recursive function since the parameters they take are not the same:

// internal recursive function that takes two strings
static int cmpstr_int(const char* p1, const char* p2){
    int p1l = strlen(p1);
    int p2l = strlen(p2);
    int r = strncmp(p1, p2, p1l<p2l?p1l:p2l);
    if(r == 0 && p1l != p2l){
        if(p1l < p2l){
            return cmpstr_int(p1, p2 + p1l);
        }
        return cmpstr_int(p1 + p2l, p2);
    }
    return r;
}

// comparison function that extracts desired datatype from void * params
// and passes them to recursive function
static int cmpstr(const void* p1p, const void* p2p){
    const char *p1 = *(char * const *)p1p;
    const char *p2 = *(char * const *)p2p;
    return cmpstr_int(p1, p2);
}

Upvotes: 3

Related Questions