Rongxuan He

Reputation: 1

C union assignment with big endian

#include <stdio.h>

union data{
    int n;
    char ch;
    short m;
};

int main(){
    union data a;
    printf("%zu, %zu\n", sizeof(a), sizeof(union data));  /* sizeof yields size_t, so use %zu */
    a.n = 0x40;
    printf("%X, %c, %hX\n", a.n, a.ch, a.m);
    a.ch = '9';
    printf("%X, %c, %hX\n", a.n, a.ch, a.m);
    return 0;
}

My computer is little-endian, so the output is:

4, 4
40, @, 40
39, 9, 39

But on a big-endian machine, after a.n = 0x40 the bytes of a, from the lowest address to the highest, will be 0x00 0x00 0x00 0x40.

When a.ch = '9' is then executed, will all 4 bytes be overwritten, or will only the byte at the lowest address change, giving 0x39 0x00 0x00 0x40?

Upvotes: 0

Views: 425

Answers (3)

John Bollinger

Reputation: 181199

According to the standard, when a union member is written, all bytes of the union that do not correspond to that member's representation take unspecified values. (C17 6.2.6.1/6-7)

It is possible that those unspecified values can be read back as the same values those locations previously held, but a strictly conforming program cannot rely on that to be the case.

In practice, you might plausibly find (for example) a compiler to implement some or all writes to the example union as four-byte writes from a register to memory, so that indeed, assigning a value to a.ch modifies more than one byte of a.n and a.m.

Upvotes: 1

Eric Postpischil

Reputation: 223795

C 2018 6.2.6.1 7 says:

When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.

Some points:

  • A C implementation could choose to always assign all bytes of the union.
  • A C implementation could choose to always assign only the bytes of the member assigned and to leave the others unchanged.
  • A C implementation may do different things in different circumstances. For example, if there is an isolated assignment, the compiler might just write the byte(s) of the assigned member to memory. But, if there is an assignment mixed with other uses, and the compiler has bytes from the previously assigned member already in a register, the compiler might optimize the assignment to a register load or copy that changes the whole register and then later write the register to memory, thus overwriting the whole union instead of just the byte(s) of the newly assigned member.
  • In the definition of the C standard, an “unspecified value” is not actually a value at all; it is a sort of amorphous state in which the bytes may have different values each time they are used. This means, once you have assigned a smaller member, the implementation does not actually have to use the other bytes in memory when you attempt to use a different member—it can use whatever bytes are handy, such as whatever is lying in a processor register.

Upvotes: 2

Peter v.d. Vos

Reputation: 66

Use a.n = 0x11223344 instead of 0x40. With four distinct byte values you can see whether the compiler overwrites the whole union or just part of it.

Upvotes: 0
