atturri

Reputation: 1153

Will this strict-aliasing rule violation have the behavior I expect?

I know violating the strict-aliasing rule is Undefined Behavior as per the C standard. Please don't tell me it is UB and there is nothing to talk about.

I'd like to know if there are compilers which won't have the expected behavior (defined by me below) for the following code.

Assume the size of float and int is 4 bytes, and a big-endian machine.

float f = 1234.567;  /* Any value here */
unsigned int u = *(unsigned int *)&f;

My expected behavior, in plain English, is "get the four bytes where the float is stored and put them in an int as is". In code it would be this (I think there is no UB here):

float f = 1234.567;  /* Any value here */
unsigned char *p = (unsigned char *)&f;
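/* Cast before shifting: p[0] promotes to int, and shifting a byte >= 0x80 left by 24 would overflow a 32-bit int. */
unsigned int u = ((unsigned int)p[0] << 24) | ((unsigned int)p[1] << 16) | ((unsigned int)p[2] << 8) | p[3];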

I'd also welcome practical and concrete examples of why, apart from being UB as per the standard, a compiler would have what I consider an unexpected behavior.

Upvotes: 0

Views: 113

Answers (3)

You're invoking undefined behavior for no reason at all.

Will this strict-aliasing rule violation have the behavior I expect?

No. And you don't need to expect anything, because you can write much better looking code.

This has defined behavior that you'd like:

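#include <assert.h>
#include <stdint.h>

/* typedef so that ufi_t names a type, not a variable */
typedef union {
  float f;
  uint32_t i;
} ufi_t;
assert(sizeof(float) == sizeof(uint32_t));

ufi_t u = { 123.456f };  /* initializes the first member, f */
uint32_t i = u.i;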

You can factor it out, decent compilers will generate no code for it:

/* static inline avoids the missing external definition a plain C99 inline can cause */
static inline uint32_t int_from_float(float f) {
  ufi_t u = { f };
  return u.i;
}

You can also cast from (float *) to (ufi_t *) safely. So:

float f = 123.456;
uint32_t i = ((ufi_t*)&f)->i;

Note: language lawyers are welcome to set me straight on this last one, but that's what I make of ISO/IEC 9899:201x, section 6.5, and so on.

Upvotes: 0

Keith Thompson

Reputation: 263617

float f = 1234.567;  /* Any value here */
unsigned int u = *(unsigned int *)&f;

Some plausible reasons why this would not work as expected are:

  1. float and unsigned int are not the same size. (I've worked on systems where int is 64 bits and float is 32 bits. I've also worked on systems where both int and float are 64 bits, so your assumption that 4 bytes are copied would fail.)

  2. float and unsigned int have different alignment requirements. Specifically, if unsigned int requires stricter alignment than float, and f happens not to be aligned strictly enough for an unsigned int, reading f as if it were an unsigned int could do Bad Things. (This is probably unlikely if int and float are the same size.)

  3. A compiler might recognize that the code's behavior is undefined, and for example optimize away the assignment. (I don't have a concrete example of this.)

If you want to copy the representation of a float into an unsigned int, memcpy() is safer (and I'd first check that they actually have the same size). If you want to examine the representation of a float object, the canonical way to do that is to copy it to an array of unsigned char. Quoting the ISO C standard (6.2.6.1p4 in the N1570 draft):

Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char [ n ] (e.g., by memcpy); the resulting set of bytes is called the object representation of the value.
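
As a minimal sketch of the memcpy() approach described above: the helper name float_bits is hypothetical (it is not from the question), and the sketch assumes uint32_t is available and that float is 32 bits wide, which it checks at compile time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Refuse to compile if the two types differ in size. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: returns the object representation of f,
   reinterpreted as a uint32_t in the machine's native byte order. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);  /* byte-wise copy; no aliasing violation */
    return u;
}
```

On a big-endian machine this should produce the same value as the shift-and-or expression in the question. On an implementation that uses IEEE 754 single precision, float_bits(1.0f) is 0x3F800000. Optimizing compilers typically turn a fixed-size memcpy() like this into a plain register move, so it should cost nothing over the pointer cast.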

Upvotes: 1

Chris Dodd

Reputation: 126478

On most compilers, it will do what you expect until the optimizer decides to dead-code eliminate or move the assignment to f.

This makes it essentially impossible to test if any given compiler will always do what you expect -- it might work for one particular program, but then a slightly different one might fail. The strict-aliasing rule is basically just telling the compiler implementor "you can rearrange and eliminate these things fairly freely by assuming they never alias". When it's not useful to do things that would cause this code to fail, the optimizer probably won't, so you won't see a problem.
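
A small sketch of the kind of assumption involved (the function name store_both is hypothetical, chosen only for illustration). Because unsigned int and float are not compatible types, the compiler is allowed to assume the two pointers refer to different objects:

```c
/* Hypothetical function illustrating what the optimizer may assume. */
static unsigned int store_both(unsigned int *u, float *f)
{
    *u = 1u;
    *f = 2.0f;   /* may be assumed not to modify *u */
    return *u;   /* so the compiler may return the constant 1 without reloading *u */
}
```

Called with pointers to two distinct objects, this is well defined and returns 1. Called as store_both((unsigned int *)&x, &x) for some float x, the behavior is undefined: an unoptimized build might reload *u and return the bits of 2.0f, while an optimized build might return 1. Which of the two you get can change with the optimization level or with unrelated changes to the surrounding code.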

The bottom line is that it is not useful to talk about "which compilers this will sometimes work on", as it might suddenly stop working in the future on any of them if something seemingly unrelated changes.

Upvotes: 8
