inquam

Reputation: 12942

Union with bitfield gives unexpected value to bitfield members

I have the following construct, meant to take a 48-bit value that contains four 12-bit values and extract them.

struct foo {
    union {
        unsigned __int64 data;
        struct {
            unsigned int a      : 12;
            unsigned int b      : 12;
            unsigned int c      : 12;
            unsigned int d      : 12;
            unsigned int unused : 16;
        };
    };
} foo;

The data in question is then assigned using

foo.data = (unsigned __int64)Value;

Value here is initially a double used to store the data.

My assumptions when making a bit field are

Are these correct?

Testing with

Value = 206225551364

we get a Value that should contain the bits

0000 0011 0000 0000 0100 0000 0000 0011 0000 0000 0100

This should result in

a: 0000 0000 0100 = 4
b: 0000 0000 0011 = 3
c: 0000 0000 0100 = 4
d: 0000 0000 0011 = 3

But running this the actual returned values are

a: 4
b: 3
c: 48
d: 0

Although the values should fit within the unsigned ints, switching around the types used sometimes changed the values. So it felt like it had something to do with how the data was interpreted when added to the bitfield.

By adding #pragma pack(1), which I understand has something to do with alignment but haven't come across very often, I all of a sudden get the expected values.

struct foo {
    union {
        unsigned __int64 data;
#pragma pack(1)
        struct {
            unsigned int a      : 12;
            unsigned int b      : 12;
            unsigned int c      : 12;
            unsigned int d      : 12;
            unsigned int unused : 16;
        };
    };
} foo;

a: 4
b: 3
c: 4
d: 3

But I don't feel comfortable just accepting this. I want to understand it and thus ensure it actually works and isn't just appearing to work while the values don't take up more than 4 bits for instance.

So,

Upvotes: 2

Views: 1268

Answers (2)

eerorika

Reputation: 238461

why am I seeing the issue to begin with?

Firstly, accessing an inactive member of a union has undefined behaviour in C++. But let us assume that your system allows it.

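unsigned is probably 32 bits. a and b fit into the first unsigned, taking a total of 24 bits. There are only 8 bits left in this unsigned, so the 12-bit c does not fit. Instead, c starts a new unsigned, leaving 8 bits of padding. With that layout c reads bits 32 to 43 of data, which hold 0x30 = 48 for your test value, and d reads bits 44 to 55, which are all zero.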

This is one possible outcome. Bit field layout is implementation defined. On another system you might see the outcome that you expected. Or output that is different from what you expect and different from what you observed here.

What does the #pragma pack statement do that fixes the issue?

It probably changes the layout rules to allow "straddling" of the bitfield across multiple underlying objects. This probably makes accessing it a bit slower.

Can one deduce when this will become a problem and not?

If you don't try to straddle the underlying objects, then there won't be a difference in whether the layout supports that. In this case, you could simply use a 64 bit underlying object.

This is not the only way the layout of bitfields might differ from what you expect though. Bitfields could be most significant first or last for example. The number of bits in unsigned itself is implementation defined.

In general, the layout of bitfields is not something that should be relied upon.

How would what I want to achieve best be done, then?

To avoid UB, instead of punning through a union, you can create the other object and copy the bytes from one over the other. But first, you must make sure the objects have the same size. The copying can be done with std::memcpy or, since C++20, std::bit_cast.

To avoid issues with straddling, use sets of bitfields that fill each underlying object completely. In this case by using a 64 bit underlying object.

To get a reliable layout, don't use bitfields in the first place. bartop's answer shows how to do this with shifts and masks (although the layout still relies on endianness).

Upvotes: 2

bartop

Reputation: 10315

Long story short - casting data via a union is undefined behaviour, regardless of what you are doing. So it works or does not work just by accident. The only thing you are allowed to do with a union is read the member you wrote to last. Do anything else and your program is invalid.

EDIT:

And even if this were allowed, without #pragma pack you depend on how the compiler aligns the bitfields within the struct, which is probably to 32 or 64 bits. Judging by the values you observed (a and b correct, c = 48, d = 0), the compiler uses 32-bit units and does not let a field cross a unit boundary, so in this case your struct probably looks like this in memory:

struct {
    unsigned int a      : 12;
    unsigned int b      : 12;
    unsigned int pad0   : 8;   // c does not fit into the first 32 bits
    unsigned int c      : 12;
    unsigned int d      : 12;
    unsigned int pad1   : 8;   // unused does not fit into the second 32 bits
    unsigned int unused : 16;
    unsigned int pad2   : 16;
};

If you want to extract some data from the struct you should probably use masking and bitshifts like this:

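unsigned mask12 = 0xFFF;              // the 12 least significant bits set
// data must be a 64-bit unsigned value so that the shifts by 24 and 36 are valid
unsigned a = data & mask12;           // bits 0..11
unsigned b = (data >> 12) & mask12;   // bits 12..23
unsigned c = (data >> 24) & mask12;   // bits 24..35
unsigned d = (data >> 36) & mask12;   // bits 36..47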

Upvotes: 2
