Reputation: 744
I recently saw this post about endianness macros in C and I can't really wrap my head around the first answer.
Code supporting arbitrary byte orders, ready to be put into a file called order32.h:
#ifndef ORDER32_H
#define ORDER32_H
#include <limits.h>
#include <stdint.h>
#if CHAR_BIT != 8
#error "unsupported char size"
#endif
enum
{
    O32_LITTLE_ENDIAN = 0x03020100ul,
    O32_BIG_ENDIAN = 0x00010203ul,
    O32_PDP_ENDIAN = 0x01000302ul
};

static const union { unsigned char bytes[4]; uint32_t value; } o32_host_order =
    { { 0, 1, 2, 3 } };
#define O32_HOST_ORDER (o32_host_order.value)
#endif
You would check for little endian systems via
O32_HOST_ORDER == O32_LITTLE_ENDIAN
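To make the usage concrete, a minimal test program (a sketch, assuming the header above is saved as order32.h) would look something like this:
#include <stdio.h>
#include "order32.h"

int main(void)
{
    if (O32_HOST_ORDER == O32_LITTLE_ENDIAN)
        puts("little-endian");
    else if (O32_HOST_ORDER == O32_BIG_ENDIAN)
        puts("big-endian");
    else if (O32_HOST_ORDER == O32_PDP_ENDIAN)
        puts("PDP-endian");
    else
        puts("unknown byte order");
    return 0;
}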
I do understand endianness in general, and I think I understand what the code does. What I don't understand are the following aspects:
Why is a union needed to store the test case? Isn't uint32_t guaranteed to be able to hold 32 bits/4 bytes as needed?
And what does the assignment { { 0, 1, 2, 3 } } mean? It assigns the value to the union, but why the strange markup with two braces?
Why the check for CHAR_BIT? One comment mentions that it would be more useful to check UINT8_MAX.
Why is char even used here, when it's not guaranteed to be 8 bits wide? Why not just use uint8_t? I found this link to Google-Devs github. They don't rely on this check... Could someone please elaborate?
Upvotes: 3
Views: 1592
Reputation: 146053
Why is a union needed to store the test case?
The entire point of the test is to alias the array with the magic value the array will create.
Isn't uint32_t guaranteed to be able to hold 32 bits/4 bytes as needed?
Well, more or less. It will hold exactly 32 bits, but beyond that there are no guarantees; it is 4 bytes only because a byte here is 8 bits, which is exactly what the CHAR_BIT check establishes. It would fail only on some really fringe architecture you will never encounter.
And what does the assignment { { 0, 1, 2, 3 } } mean? It assigns the value to the union, but why the strange markup with two braces?
The outer braces initialize the union (that is, its first member), and the inner braces initialize the bytes array within it.
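As an equivalent, more explicit spelling (a sketch using C99 designated initializers, not how the original header writes it), the same initialization reads:
static const union { unsigned char bytes[4]; uint32_t value; } o32_host_order =
    { .bytes = { 0, 1, 2, 3 } };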
Why the check for CHAR_BIT?
Because CHAR_BIT == 8 is the actual guarantee the code relies on: it is what makes the four-element array and the 32-bit integer exactly the same size. If that check doesn't blow up, everything will work.
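If C11 is available, the same requirement could be expressed as a static assertion instead of a preprocessor error; a minimal sketch of that variant:
#include <limits.h>

_Static_assert(CHAR_BIT == 8, "order32.h assumes 8-bit chars");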
One comment mentions that it would be more useful to check UINT8_MAX?
Why is char even used here, when it's not guaranteed to be 8 bits wide?
Because in fact it always is, these days.
Why not just use uint8_t?
I found this link to Google-Devs github. They don't rely on this check... Could someone please elaborate?
Lots of other choices would work also.
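For example, a plain runtime probe is another common choice (a sketch for illustration, not taken from that repository):
#include <stdint.h>

/* Returns 1 on a little-endian machine, 0 on a big-endian one,
   by inspecting the byte at the lowest address of a known value. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe;
}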
Upvotes: 3
Reputation: 34145
{{0, 1, 2, 3}} is the initializer for the union, which will result in the bytes member being filled with [0, 1, 2, 3].
Now, since the bytes array and the uint32_t occupy the same space, you can read the same value as a native 32-bit integer. The value of that integer shows you how the array was shuffled - which really means which endian system you are using.
There are only 3 popular possibilities here - O32_LITTLE_ENDIAN, O32_BIG_ENDIAN, and O32_PDP_ENDIAN.
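As a worked example (a standalone sketch, independent of the header), here is how each of those orderings maps the bytes { 0, 1, 2, 3 } onto a 32-bit value:
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const unsigned char b[4] = { 0, 1, 2, 3 };  /* b[0] at the lowest address */
    /* Assemble the value each byte order would produce when the four
       bytes are read back as one uint32_t: */
    uint32_t little = (uint32_t)b[3] << 24 | (uint32_t)b[2] << 16
                    | (uint32_t)b[1] << 8  | (uint32_t)b[0];      /* 0x03020100 */
    uint32_t big    = (uint32_t)b[0] << 24 | (uint32_t)b[1] << 16
                    | (uint32_t)b[2] << 8  | (uint32_t)b[3];      /* 0x00010203 */
    uint32_t pdp    = (uint32_t)b[1] << 24 | (uint32_t)b[0] << 16
                    | (uint32_t)b[3] << 8  | (uint32_t)b[2];      /* 0x01000302 */
    printf("0x%08lx 0x%08lx 0x%08lx\n",
           (unsigned long)little, (unsigned long)big, (unsigned long)pdp);
    return 0;
}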
As for the char / uint8_t question - I don't know. I think it makes more sense to just use uint8_t with no checks.
Upvotes: 2
Reputation: 223689
The initialization has two sets of braces because the inner braces initialize the bytes array. So bytes[0] is 0, bytes[1] is 1, etc.
The union allows a uint32_t to lie on the same bytes as the unsigned char array and be interpreted in whatever the machine's endianness is. So if the machine is little endian, 0 is in the low-order byte and 3 is in the high-order byte of value. Conversely, if the machine is big endian, 0 is in the high-order byte and 3 is in the low-order byte of value.
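A quick way to see this on your own machine (a sketch, assuming the header is saved as order32.h):
#include <stdio.h>
#include "order32.h"

int main(void)
{
    /* On a little-endian machine the low-order byte of value is bytes[0],
       so this prints 0; on a big-endian machine it is bytes[3] and prints 3. */
    printf("low-order byte: %u\n", (unsigned)(O32_HOST_ORDER & 0xFFu));
    return 0;
}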
Upvotes: 2