Reputation: 1153
I know violating the strict-aliasing rule is Undefined Behavior as per the C standard. Please don't tell me it is UB and there is nothing to talk about.
I'd like to know if there are compilers which won't have the expected behavior (defined by me below) for the following code.
Assume the size of float and int is 4 bytes, and a big-endian machine.
float f = 1234.567; /* Any value here */
unsigned int u = *(unsigned int *)&f;
My expected behavior in English words is "get the four bytes where the float is stored and put them in an int as is". In code it would be this (I think there is no UB here):
float f = 1234.567; /* Any value here */
unsigned char *p = (unsigned char *)&f;
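/* Cast before shifting: p[0] would otherwise be promoted to int, and shifting a value >= 0x80 left by 24 overflows a 32-bit int. */
unsigned int u = ((unsigned int)p[0] << 24) | ((unsigned int)p[1] << 16) | ((unsigned int)p[2] << 8) | p[3];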
I'd also welcome practical and concrete examples of why, apart from being UB as per the standard, a compiler would have what I consider an unexpected behavior.
Upvotes: 0
Views: 113
Reputation: 98505
You're invoking undefined behavior for no reason at all.
Will this strict-aliasing rule violation have the behavior I expect?
No. And you don't need to expect anything, because you can write much better looking code.
This has defined behavior that you'd like:
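/* typedef is needed so that ufi_t names a type rather than a variable */
typedef union {
float f;
uint32_t i;
} ufi_t;
/* the union trick only makes sense if both members have the same size */
assert(sizeof(float) == sizeof(uint32_t));
ufi_t u = { 123.456 }; /* initializes the first member, f */
uint32_t i = u.i;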
You can factor it out; decent compilers will generate no extra code for it:
inline uint32_t int_from_float(float f) {
ufi_t u = { f };
return u.i;
}
You can also cast from (float *) to (ufi_t *) safely. So:
float f = 123.456;
uint32_t i = ((ufi_t*)&f)->i;
Note: language lawyers are welcome to set me straight on this last one, but that's what I make of ISO/IEC 9899:201x, section 6.5 and so on.
Upvotes: 0
Reputation: 263617
float f = 1234.567; /* Any value here */
unsigned int u = *(unsigned int *)&f;
Some plausible reasons why this would not work as expected are:
float and unsigned int are not the same size. (I've worked on systems where int is 64 bits and float is 32 bits. I've also worked on systems where both int and float are 64 bits, so your assumption that 4 bytes are copied would fail.)
float and unsigned int have different alignment requirements. Specifically, if unsigned int requires stricter alignment than float, and f happens not to be aligned that strictly, reading f as if it were an unsigned int could do Bad Things. (This is probably unlikely if int and float are the same size.)
A compiler might recognize that the code's behavior is undefined, and for example optimize away the assignment. (I don't have a concrete example of this.)
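The first two points can at least be checked mechanically. The following is a minimal sketch in C11; the function name can_overlay_uint_on_float is hypothetical and only illustrates the checks. Note that passing these checks does not make the pointer cast valid: it still violates the aliasing rule, so the third point remains.

```c
#include <assert.h>
#include <stdalign.h>  /* alignof (C11) */
#include <stdbool.h>

/* Hypothetical helper: reports whether float and unsigned int agree on
   size and whether unsigned int is no more strictly aligned than float.
   It says nothing about aliasing; the cast is still undefined behavior. */
static bool can_overlay_uint_on_float(void)
{
    return sizeof(float) == sizeof(unsigned int)
        && alignof(unsigned int) <= alignof(float);
}
```

On a platform where the function returns false, the cast in the question would not even copy the intended bytes, independent of any optimizer behavior.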
If you want to copy the representation of a float into an unsigned int, memcpy() is safer (and I'd first check that they actually have the same size). If you want to examine the representation of a float object, the canonical way to do that is to copy it to an array of unsigned char. Quoting the ISO C standard (6.2.6.1p4 in the N1570 draft):
Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char [n] (e.g., by memcpy); the resulting set of bytes is called the object representation of the value.
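As a rough sketch of both suggestions, assuming a 32-bit float (checked at compile time) and using the hypothetical helper names float_bits and float_bytes:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>   /* memcpy */

/* Assumption: float is exactly as wide as uint32_t on this platform. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "float is assumed to be 32 bits wide");

/* Hypothetical helper: copy the object representation of a float
   into a uint32_t without any aliasing violation. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: copy the object representation into an array of
   unsigned char, as described in 6.2.6.1p4. The byte order in out[]
   depends on the machine's endianness. */
static void float_bytes(float f, unsigned char out[sizeof(float)])
{
    memcpy(out, &f, sizeof f);
}
```

Optimizing compilers commonly turn a fixed-size memcpy like this into a plain register move, so there is usually no cost compared to the pointer cast.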
Upvotes: 1
Reputation: 126478
On most compilers, it will do what you expect until the optimizer decides to dead-code eliminate or move the assignment to f.
This makes it essentially impossible to test if any given compiler will always do what you expect -- it might work for one particular program, but then a slightly different one might fail. The strict-aliasing rule is basically just telling the compiler implementor "you can rearrange and eliminate these things fairly freely by assuming they never alias". When it's not useful to do things that would cause this code to fail, the optimizer probably won't, so you won't see a problem.
The bottom line is that it is not useful to talk about "which compilers this will sometimes work on", as it might suddenly stop working in the future on any of them if something seemingly unrelated changes.
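To make the optimizer's freedom concrete, here is a small sketch (the function name store_both is hypothetical). The function itself is perfectly valid; the point is what the compiler is allowed to assume about its arguments.

```c
#include <assert.h>

/* Hypothetical illustration: because int and float are incompatible
   types, the compiler may assume ip and fp never refer to the same
   object. */
static int store_both(int *ip, float *fp)
{
    *ip = 1;
    *fp = 2.0f;
    /* The compiler may return the constant 1 here without reloading
       *ip, since the store through fp "cannot" have changed it. */
    return *ip;
}
```

Called with two distinct objects, this returns 1 as expected. Called as store_both(&x, (float *)&x), the behavior is undefined: an unoptimized build might reload *ip and return the bit pattern of 2.0f, while an optimized build might return 1. That difference between builds is exactly why testing one program on one compiler proves little.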
Upvotes: 8