Sumit Cornelius

Reputation: 259

Initialization of a union in C

I came across this objective question on the C programming language. The output for the following code is supposed to be 0 2, but I don't understand why.

Please explain the initialization process. Here's the code:

#include <stdio.h>

int main()
{
  union a
  {
    int x;
    char y[2];
  };
  union a z = {512};
  printf("\n%d %d", z.y[0], z.y[1]);
  return 0;
}

Upvotes: 27

Views: 2792

Answers (5)

haccks

Reputation: 106012

The standard says that

6.2.5 Types:

A union type describes an overlapping nonempty set of member objects, each of which has an optionally specified name and possibly distinct type.

The compiler allocates only enough space for the largest of the members, which overlay each other within this space. In your case, memory is allocated for int data type (assuming 4-bytes). The line

union a z = {512};

will initialize the first member of union z, i.e. x becomes 512. With a 32-bit int, its binary representation is 0000 0000 0000 0000 0000 0010 0000 0000.

How this value is laid out in memory depends on the machine architecture. On a little-endian machine, the least significant byte is stored at the lowest address:

Address     Value
0x1000      0000 0000
0x1001      0000 0010
0x1002      0000 0000 
0x1003      0000 0000

On a big-endian machine, the most significant byte is stored at the lowest address:

Address     Value
0x1000      0000 0000
0x1001      0000 0000
0x1002      0000 0010 
0x1003      0000 0000

z.y[0] accesses the content at address 0x1000 and z.y[1] accesses the content at address 0x1001, so their values depend on which of the two layouts your machine uses.
It seems that your machine is little endian, and therefore z.y[0] = 0 and z.y[1] = 2, and the output is 0 2.

But, you should note that footnote 95 of section 6.5.2.3 states that

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.

Upvotes: 8

shreyans800755

Reputation: 354

The size of a union is determined by its largest member, since it must be able to hold any one of them. So here it is the size of int.

Assuming a 4-byte int and a 1-byte char, sizeof(union a) is 4 bytes.

Now, let's see how it is actually stored in memory:

For example, suppose an instance of union a is stored at addresses 2000-2003 on a little-endian machine:

  • 2000 -> last(4th / least significant / rightmost) byte of int x, y[0]

  • 2001 -> 3rd byte of int x, y[1]

  • 2002 -> 2nd byte of int x

  • 2003 -> 1st byte of int x (most significant)

Now, when you initialize z with {512}:

since z.x = 0x00000200,

  • M[2000] = 0x00

  • M[2001] = 0x02

  • M[2002] = 0x00

  • M[2003] = 0x00

So, when you print y[0] and y[1], you get the data at M[2000] and M[2001], which are 0 and 2 in decimal respectively.

Upvotes: 1

Gopi

Reputation: 19864

The memory allocated for the union is the size of the largest type in the union, which is int in this case. Let's say the size of int on your system is 2 bytes; then

512 will be 0x200.

The representation looks like:

0000 0010 0000 0000
|        |         |
------------------- 
Byte 1     Byte 0

So the first byte is 0 and the second one is 2 (on little-endian systems).

A char is one byte on all systems (sizeof(char) is 1 by definition).

So the access z.y[0] and z.y[1] is per byte access.

z.y[0] = 0000 0000 = 0
z.y[1] = 0000 0010 = 2

I am just showing you how the memory is allocated and how the value is stored. You need to consider the points below, since the output depends on them.

Points to be noted:

  1. The output is completely system dependent.
  2. The endianness and sizeof(int) both matter, and they vary across systems.

PS: The memory occupied by both the members is the same in union.

Upvotes: 8

Arjun Sreedharan

Reputation: 11453

I am going to assume that you use a little-endian system where sizeof(int) is 4 bytes (32 bits) and a char is 1 byte (8 bits), and one in which integers are represented in two's complement form. A union only has the size of its largest member, and all of its members share this same piece of memory.

Now, you are writing to this memory an integer value of 512.

512 in binary is 1000000000.

or in 32 bit two's complement form:

00000000 00000000 00000010 00000000.

Now convert this to its little endian representation and you'll get:

00000000 00000010 00000000 00000000
|______| |______|
   |         |
  y[0]      y[1]

Now look at the diagram above to see what happens when you access this memory through the indices of a char array.

Thus, y[0] is 00000000 which is 0,

and y[1] is 00000010 which is 2.

Upvotes: 19

i486

Reputation: 6563

For automatic (non-static) variables, this initialization is equivalent to declaring the union and then assigning to its first member:

union a z;
z.x = 512;

Upvotes: 0
