ereOn

Reputation: 55786

Is using a union in place of a cast well defined?

I had a discussion this morning with a colleague regarding the correctness of a "coding trick" to detect endianness.

The trick was:

bool is_big_endian()
{
  union
  {
    int i;
    char c[sizeof(int)];
  } foo;


  foo.i = 1;
  return (foo.c[0] == 1);
}

To me, this usage of a union seems incorrect, because setting one member of the union and then reading another is not well defined. But I have to admit that this is just a feeling, and I lack actual proof to back up my point.

Is this trick correct? Who is right here?

Upvotes: 16

Views: 3140

Answers (5)

moooeeeep

Reputation: 32542

Don't do this; it is better to use something like the following:

#include <arpa/inet.h>
//#include <winsock2.h> // <-- for Windows use this instead

#include <stdint.h>

bool is_big_endian() {
  uint32_t i = 1;
  return i == htonl(i);
}

Explanation:

The htonl function converts a u_long from host to TCP/IP network byte order (which is big-endian).



Upvotes: 7

James Kanze

Reputation: 154017

The code has undefined behavior, although some (most?) compilers will define it, at least in limited cases.

The intent of the standard is that reinterpret_cast be used for this. This intent isn't well expressed, however, since the standard can't really define the behavior; there is no desire to define it when the hardware won't support it (e.g. because of alignment issues). And it's also clear that you can't just reinterpret_cast between two arbitrary types and expect it to work.

From a quality of implementation point of view, I would expect both the union trick and reinterpret_cast to work, if the union or the reinterpret_cast is in the same functional block; the union should work as long as the compiler can see that the ultimate type is a union (although I've used compilers where this wasn't the case).

Upvotes: 0

ildjarn

Reputation: 62995

You're correct that that code doesn't have well-defined behavior. Here's how to do it portably:

#include <cstring>

bool is_big_endian()
{
    static unsigned const i = 1u;
    char c[sizeof(unsigned)] = { };
    std::memcpy(c, &i, sizeof(c));
    return !c[0];
}

// or, alternatively

bool is_big_endian()
{
    static unsigned const i = 1u;
    return !*static_cast<char const*>(static_cast<void const*>(&i));
}

Upvotes: 2

Prasoon Saurav

Reputation: 92884

Your code is not portable. It might work on some compilers or it might not.

You are right that the behaviour is undefined when you try to access the inactive member of a union, as the code given does.

§9.5/1

In a union, at most one of the data members can be active at any time, that is, the value of at most one of the data members can be stored in a union at any time.

So foo.c[0] == 1 is incorrect because c is not active at that moment. Feel free to correct me if you think I am wrong.
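To illustrate the distinction under the reading of the standard given above, here is a small sketch (the union and function names are hypothetical) in which only the member written last is read back:

```cpp
// Hypothetical union mirroring the one in the question.
union IntOrBytes
{
    int i;
    char c[sizeof(int)];
};

// Writes i and reads i, so only the active member is accessed.
int read_active_member()
{
    IntOrBytes u;
    u.i = 1;     // i is now the active member
    return u.i;  // reading i is fine; reading u.c[0] here is the access in dispute
}
```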

Upvotes: 13

excray

Reputation: 2858

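The function should be named is_little_endian, since foo.c[0] == 1 is true when the least significant byte comes first. I think you can use this union trick. Alternatively, you can cast the address of the int to a char pointer and read the first byte.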

Upvotes: 0
