Isaac
Isaac

Reputation: 10834

What is *(uint32_t *) &buffer[index]?

In some supposed-C++ code I found, I have buffer defined as const void *buffer; (it's arbitrary binary data that, I think, gets interpreted as a stream of 32-bit unsigned integers) and in many places, I have

*(uint32_t *) &buffer[index]

where index is some kind of integer (I think it was long or unsigned long and got swept up in my replacing those with int32_t and uint32_t when I was making the code work on a 64-bit system).

I recognize that this is taking the address of buffer (&buffer), casting it as a pointer to a uint32_t, and dereferencing that, at least based on this question... but then I'm confused by how the [index] part interacts with that or where I missed inserting the [index] part in between the steps I listed.

What, conceptually, is this doing? Is there some way I could define another variable to be a better type, with the casting there once, and then use that, rather than having this complicated expression throughout the code? Is this actually C++ or is this C99?

edit: The first couple of lines of the code are:

const void *buffer = data.bytes;
if (ntohl(*(int32_t *) buffer) != 'ttcf') {
    return;
}
uint32_t ttf_count = ntohl(*(uint32_t *) &buffer[0x08]);

where data.bytes has type const void *. Before I was getting buffer from data.bytes, it was char *.

edit 2: Apparently, having const void *buffer work is not normal C (though it absolutely works in my situation), so if it makes more sense, assume it's const char *buffer.

Upvotes: 1

Views: 3090

Answers (2)

M.M
M.M

Reputation: 141628

Lots of issues here! First of all, a void * cannot be dereferenced. buffer[index] is illegal in ISO C, although some compilers apparently have an extension that will treat it as (void)((char *)buffer)[index].

You suggest in comments that the code originally used char * - I recommend you leave it that way. Assuming buffer returns to being const char *:

if (ntohl(*(int32_t *) buffer) != 'ttcf') { return; }

The intent here is to pretend that the first four bytes of buffer contain an integer; read that integer, and compare it to 'ttcf'. The latter is a multibyte character constant, the behaviour of which is implementation-defined. It could represent four characters 't', 't', 'c', 'f', or 'f', 'c', 't', 't', or in fact anything else at all of type int.

A greater problem is that pretending a buffer contains an int when it did not actually get written via an expression of type int violates the strict aliasing rule. This is unfortunately a common technique in older code, but even since the first C standard it has caused undefined behaviour. If you use a compiler that performs type-based aliasing optimization it could wreck your code.

A way to write this code avoiding both of those problems is:

if ( memcmp(buffer, "ttcf", 4) ) { return; }

The later line uint32_t ttf_count = ntohl(*(uint32_t *) &buffer[0x08]); has similar issues. In this case there is no doubt that the best fix is:

uint32_t ttf_count; 
memcpy(&ttf_count, buffer + 0x08, sizeof ttf_count);
ttf_count = ntohl(ttf_count);

As discussed in comments, you could make an inline function to keep this tidy. In my own code I do something like:

static inline uint32_t be_to_uint32(void const *ptr)
{
    unsigned char const *p = ptr;
    return p[0] * 0x1000000ul + p[1] * 0x10000ul + p[2] * 0x100 + p[3];
}

and a similar version le_to_uint32 that reads bytes in the opposite order; then I use whichever of those corresponds to the input format instead of using ntohl.

Upvotes: 2

dbush
dbush

Reputation: 224387

Putting parenthesis in place to make the order of operations more explicit:

*((uint32_t *) &(buffer[index]))

So you're treating buffer as an array, however because buffer is a void * you can't dereference it directly.

Assuming you want to treat this buffer as an array of uint32_t, what you want to do is this:

((uint32_t *)buffer)[index]

Which can also be written as:

*((uint32_t *)buffer + index)

EDIT:

If index is the byte offset in the buffer, that changes things. In that case, I'd recommend defining the buffer as const char * instead of const void *. That way, you can be sure the dereferencing of the array is working properly.

So to break down the expression:

*(uint32_t *) &buffer[index]

You're going index bytes into buffer: buffer[index]

Then taking the address of that byte: &buffer[index]

Then casting that address to a uint32_t: (uint32_t *) &buffer[index]

Then dereferencing the uint32_t value: *(uint32_t *) &buffer[index]

Upvotes: 4

Related Questions