Reputation: 63
I read this on a blog:
Violating Type Rules: It is undefined behavior to cast an int* to a float* and dereference it (accessing the "int" as if it were a "float"). C requires that these sorts of type conversions happen through memcpy: using pointer casts is not correct and undefined behavior results. The rules for this are quite nuanced and I don't want to go into the details here (there is an exception for char*, vectors have special properties, unions change things, etc). This behavior enables an analysis known as "Type-Based Alias Analysis" (TBAA) which is used by a broad range of memory access optimizations in the compiler, and can significantly improve performance of the generated code. For example, this rule allows clang to optimize this function:
How can you use the memcpy function for type coercion? And what about the exception to char*?
I don't understand how to use the memcpy function for type coercion.
Upvotes: 2
Views: 128
Reputation: 48010
Suppose you have the float value 1.25. And suppose you want to confirm that its actual IEEE-754 representation in hexadecimal is 3fa00000. There are at least four different ways you might try to do this:
(1) Take a float pointer and cast it to an integer pointer, and indirect on it:
float f = 1.25;
printf("%08x\n", *(uint32_t *)&f);
(This fragment quietly assumes 32-bit int. For better portability, you could use printf("%08" PRIx32 "\n", *(uint32_t *)&f); instead.)
(2) Use a union:
union {float f; uint32_t i;} u;
u.f = f;
printf("%08x\n", u.i);
(3) Use a char pointer, and iterate/index:
unsigned char *p = (unsigned char *)&f;
for(int i = 3; i >= 0; i--) printf("%02x", p[i]);
(Note that this code fragment assumes little-endian.)
(4) Use memcpy:
uint32_t x;
memcpy(&x, &f, sizeof x);
printf("%08x\n", x);
Now, the take-home lesson is that not all of these methods work reliably any more, because of the strict aliasing rule.
In particular, method (1) is flatly illegal. It's a textbook example of what the strict aliasing rule disallows.
I think you're still allowed to use a union as in method 2, but you may have to put on a language lawyer hat to convince yourself of it. (See also the comments on this answer below.)
Methods (3) and (4), however, continue to work, because they take advantage of an explicit exception to the strict aliasing rule: you are allowed to access the bytes of an object through a pointer of the "wrong" type, as long as that "wrong" type is specifically a character type (char, signed char, or unsigned char). Method (4) qualifies because memcpy is defined to copy its data as an array of characters.
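To make the motivation a bit more concrete, here is a minimal sketch of the kind of function where a compiler can exploit the rule. The name store_both is made up for illustration, and whether a given compiler actually performs the optimization depends on the compiler and its flags.

```c
#include <assert.h>

/* Hypothetical function, for illustration only. Under the strict aliasing
   rule an int object cannot legally be modified through a float lvalue, so
   the compiler may assume the store through fp leaves *ip untouched and
   return the constant 1 without reloading *ip from memory. */
int store_both(int *ip, float *fp)
{
    assert(ip && fp);   /* both pointers must refer to real objects */
    *ip = 1;
    *fp = 2.0f;
    return *ip;
}
```

Called with pointers to two separate objects, this is perfectly well defined and returns 1. Calling it as store_both(&i, (float *)&i) is method (1) in disguise: the behavior is undefined, and an optimized build is free to return 1 even though the bits of i were overwritten.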
So I think this is clear, but in answer to your specific questions:
How can you use the memcpy function for type coercion?
As in method (4).
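If you do this often, you can wrap the copy in a pair of small helpers. This is only a sketch: the names float_to_bits and bits_to_float are made up here, and it assumes float is a 32-bit IEEE-754 type (the size part of that is checked at compile time, the format part is not).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The copy only makes sense if both types have the same size. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: return the object representation of f as an integer. */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* copies bytes; no pointer punning */
    return bits;
}

/* Hypothetical helper: the inverse, rebuilding a float from its bit pattern. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With <stdio.h> and <inttypes.h> included, printf("%08" PRIx32 "\n", float_to_bits(1.25f)); prints 3fa00000 on an IEEE-754 platform, and bits_to_float(0x3fa00000) gives back 1.25.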
And what about the exception to char *?
That's the explicit exception in the strict aliasing rule that allows method (3) to work.
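Because the exception applies to any object, not just float, you can write one routine that inspects the bytes of anything. Here is a rough sketch; bytes_to_hex is a made-up name, and it reports the bytes in memory order, so the result for a multi-byte value depends on the machine's endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the bytes of any object, in memory order,
   as lowercase hex digits into out. The caller must supply a buffer of
   at least 2 * size + 1 chars. */
void bytes_to_hex(const void *obj, size_t size, char *out)
{
    /* Reading any object through unsigned char * is explicitly allowed. */
    const unsigned char *p = obj;

    assert(obj && out);
    for (size_t i = 0; i < size; i++)
        sprintf(out + 2 * i, "%02x", (unsigned)p[i]);
    out[2 * size] = '\0';   /* also terminates the string when size is 0 */
}
```

For a float f you would call it as bytes_to_hex(&f, sizeof f, buf) with char buf[2 * sizeof f + 1]. Unlike the loop in method (3), this does not reverse the bytes, so a little-endian machine shows the least significant byte first.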
The rules, by the way, are significantly different here in C than in C++. Strictly speaking, I believe, in C++ not even method (2) is legal (reading a union member other than the one last written is undefined there), and the way you're expected to do this sort of thing is with method (4) and an explicit call to memcpy. (However, I'm told that optimizing compilers tend to treat calls to memcpy very specially these days, not only replacing explicit function calls with inline register moves, but sometimes even optimizing out the copy altogether, and doing something like method 1 or 2 internally, if they know they can get away with it.)
Upvotes: 4