Solruhama
Solruhama

Reputation: 111

Void pointer arithmetic and dereference

I was looking at a quick sort implementation for generic purpose and one of the parameters of the quick sort function was a void pointer, and I saw the ff arithmetics on void pointers, so I was wondering what that actually does and if it's even possible?

void _qsort(void *v, int size, int left, int right, int (*comp)(void *, void *));

void swap(void *v1, void *v2, int size) {
    char buffer[size];
    memcpy(buffer, v1, size);
    memcpy(v1, v2, size);
    memcpy(v2, buffer, size);
}

This part appears in the body of the quick sort defined above:

    void *vl = (char *)(v + (left * size));
    void *vr = (char *)(v + (mid * size));
    swap(vl, vr, size);

so from above code,

  1. How is it possible to do arithmetic on void pointer v, like v + (left * size)?
  2. What does void *vl = (char *)(v + (left * size)); part mean? isn't already casted to char pointer, if so why are we assigning it to a void pointer?
  3. In the swap part, what exactly is happening, like are the vl and vr changing their memory location, value or something else?

Upvotes: 2

Views: 194

Answers (1)

chqrlie
chqrlie

Reputation: 145297

You assume right: arithmetics on void pointers is not defined by the C Standard.

The program you are looking at uses a common compiler extension supported by gcc, clang, tcc and many others, that implements arithmetics on void pointers as if they were byte pointers. With this extension, v + (left * size) behaves as

    (void *)((char *)v + (left * size))

So the declaration void *vl = (char *)(v + (left * size)); is equivalent to:

    void *vl = (char *)((void *)((char *)v + (left * size)));

Note that the whole expression is cast implicitly to (void *) in C.

This declaration can be simplified as:

    void *vl = (char *)v + left * size;

This is probably what the programmer meant to write and their mistake went unreported because the compiler allows void * arithmetics with exactly the same effect.

Regarding your third question, the swap function exchanges the contents of the memory blocks pointed to by v1 and v2 using a local variable length array buffer of size bytes. qsort is usually called with a rather small element size, so this approach is OK, but calling qsort with an array of very long elements (more than a few megabytes) is allowed and could cause a stack overflow.

Here is a safer implementation:

void swap(void *v1, void *v2, size_t size) {
    unsigned char *p1 = v1;
    unsigned char *p2 = v2;
    while (size >= 8) {
        char buffer[8];
        memcpy(buffer, p1, sizeof buffer);
        memcpy(p1, p2, sizeof buffer);
        memcpy(p2, buffer, sizeof buffer);
        p1 += sizeof buffer;
        p2 += sizeof buffer;
        size -= sizeof buffer;
    }
    if (size > 0) {
        if (size >= 4) {
            char buffer[4];
            memcpy(buffer, p1, sizeof buffer);
            memcpy(p1, p2, sizeof buffer);
            memcpy(p2, buffer, sizeof buffer);
            p1 += sizeof buffer;
            p2 += sizeof buffer;
            size -= sizeof buffer;
        }
        while (size > 0) {
            unsigned char temp = *p1;
            *p1 = *p2;
            *p2 = temp;
            p1++;
            p2++;
            size--;
        }
    }
}

Also note these remarks:

  • The standard library function qsort has a different prototype:

      void qsort(void *base, size_t nmemb, size_t size,
                 int (*compar)(const void *, const void *));
    
  • Using int for sizes and index values is not recommended. left * size might overflow for a large array on systems with 64-bit size_t and 32-bit int: The _qsort function assumes that left * size and right * size are within the range of type size_t and at most the size of the array pointed to by v, yet this size might exceed INT_MAX, causing the int multiplication to overflow with undefined behavior. The correct type is size_t and a sanity check at the beginning of the function can be used to detect invalid arguments.

Upvotes: 4

Related Questions