Reputation: 10834
In some supposed-C++ code I found, I have buffer
defined as const void *buffer;
(it's arbitrary binary data that, I think, gets interpreted as a stream of 32-bit unsigned integers) and in many places, I have
*(uint32_t *) &buffer[index]
where index
is some kind of integer (I think it was long
or unsigned long
and got swept up in my replacing those with int32_t
and uint32_t
when I was making the code work on a 64-bit system).
I recognize that this is taking the address of buffer
(&buffer
), casting it as a pointer to a uint32_t
, and dereferencing that, at least based on this question... but then I'm confused by how the [index]
part interacts with that or where I missed inserting the [index]
part in between the steps I listed.
What, conceptually, is this doing? Is there some way I could define another variable to be a better type, with the casting there once, and then use that, rather than having this complicated expression throughout the code? Is this actually C++ or is this C99?
edit: The first couple of lines of the code are:
const void *buffer = data.bytes;
if (ntohl(*(int32_t *) buffer) != 'ttcf') {
return;
}
uint32_t ttf_count = ntohl(*(uint32_t *) &buffer[0x08]);
where data.bytes
has type const void *
. Before I was getting buffer
from data.bytes
, it was char *
.
edit 2: Apparently, having const void *buffer
work is not normal C (though it absolutely works in my situation), so if it makes more sense, assume it's const char *buffer
.
Upvotes: 1
Views: 3090
Reputation: 141628
Lots of issues here! First of all, a void *
cannot be dereferenced. buffer[index]
is illegal in ISO C, although some compilers apparently have an extension that will treat it as (void)((char *)buffer)[index]
.
You suggest in comments that the code originally used char *
- I recommend you leave it that way. Assuming buffer
returns to being const char *
:
if (ntohl(*(int32_t *) buffer) != 'ttcf') { return; }
The intent here is to pretend that the first four bytes of buffer
contain an integer; read that integer, and compare it to 'ttcf'
. The latter is a multibyte character constant, the behaviour of which is implementation-defined. It could represent four characters 't', 't', 'c', 'f'
, or 'f', 'c', 't', 't'
, or in fact anything else at all of type int
.
A greater problem is that pretending a buffer contains an int
when it did not actually get written via an expression of type int
violates the strict aliasing rule. This is unfortunately a common technique in older code, but even since the first C standard it has caused undefined behaviour. If you use a compiler that performs type-based aliasing optimization it could wreck your code.
A way to write this code avoiding both of those problems is:
if ( memcmp(buffer, "ttcf", 4) ) { return; }
The later line uint32_t ttf_count = ntohl(*(uint32_t *) &buffer[0x08]);
has similar issues. In this case there is no doubt that the best fix is:
uint32_t ttf_count;
memcpy(&ttf_count, buffer + 0x08, sizeof ttf_count);
ttf_count = ntohl(ttf_count);
As discussed in comments, you could make an inline function to keep this tidy. In my own code I do something like:
static inline uint32_t be_to_uint32(void const *ptr)
{
unsigned char const *p = ptr;
return p[0] * 0x1000000ul + p[1] * 0x10000ul + p[2] * 0x100 + p[3];
}
and a similar version le_to_uint32
that reads bytes in the opposite order; then I use whichever of those corresponds to the input format instead of using ntohl
.
Upvotes: 2
Reputation: 224387
Putting parenthesis in place to make the order of operations more explicit:
*((uint32_t *) &(buffer[index]))
So you're treating buffer
as an array, however because buffer
is a void *
you can't dereference it directly.
Assuming you want to treat this buffer as an array of uint32_t
, what you want to do is this:
((uint32_t *)buffer)[index]
Which can also be written as:
*((uint32_t *)buffer + index)
EDIT:
If index
is the byte offset in the buffer, that changes things. In that case, I'd recommend defining the buffer as const char *
instead of const void *
. That way, you can be sure the dereferencing of the array is working properly.
So to break down the expression:
*(uint32_t *) &buffer[index]
You're going index
bytes into buffer
: buffer[index]
Then taking the address of that byte: &buffer[index]
Then casting that address to a uint32_t
: (uint32_t *) &buffer[index]
Then dereferencing the uint32_t
value: *(uint32_t *) &buffer[index]
Upvotes: 4