Reputation: 259
I came across this objective question on the C programming language. The output of the following code is supposed to be 0 2, but I don't understand why.
Please explain the initialization process. Here's the code:
#include <stdio.h>

int main()
{
    union a
    {
        int x;
        char y[2];
    };
    union a z = {512};
    printf("\n%d %d", z.y[0], z.y[1]);
    return 0;
}
Upvotes: 27
Views: 2792
Reputation: 106012
The standard says that
A union type describes an overlapping nonempty set of member objects, each of which has an optionally specified name and possibly distinct type.
The compiler allocates only enough space for the largest of the members, which overlay each other within this space. In your case, memory is allocated for the int data type (assuming 4 bytes). The line
union a z = {512};
initializes the first member of the union z, i.e. x becomes 512. In binary it is represented as 0000 0000 0000 0000 0000 0010 0000 0000 on a 32-bit machine.
The memory representation depends on the machine architecture. On a 32-bit machine it will be either like this (little endian: the least significant byte is stored at the smallest address)
Address    Value
0x1000     0000 0000
0x1001     0000 0010
0x1002     0000 0000
0x1003     0000 0000
or like this (big endian: the most significant byte is stored at the smallest address)
Address    Value
0x1000     0000 0000
0x1001     0000 0000
0x1002     0000 0010
0x1003     0000 0000
z.y[0] accesses the content at address 0x1000 and z.y[1] accesses the content at address 0x1001, and those contents depend on which of the two representations is used.
It seems that your machine uses the little-endian representation, so z.y[0] = 0 and z.y[1] = 2, and the output is 0 2.
Note, however, that footnote 95 of section 6.5.2.3 (C11) states that
If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
Upvotes: 8
Reputation: 354
The size of a union is determined by its largest member, so here it is the size of int.
Assuming 4 bytes per int and 1 byte per char, we can say: sizeof(union a) = 4 bytes.
Now, let's see how it is actually stored in memory.
For example, suppose an instance of union a is stored at addresses 2000-2003 on a little-endian machine:
2000 -> last (4th / least significant / rightmost) byte of int x, y[0]
2001 -> 3rd byte of int x, y[1]
2002 -> 2nd byte of int x
2003 -> 1st byte of int x (most significant)
Now, when you initialize z with 512:
since z.x = 0x00000200,
M[2000] = 0x00
M[2001] = 0x02
M[2002] = 0x00
M[2003] = 0x00
So, when you print y[0] and y[1], the program prints the data at M[2000] and M[2001], which are 0 and 2 in decimal respectively.
Upvotes: 1
Reputation: 19864
The memory allocated for the union is the size of the largest type in the union, which is int in this case. Let's say the size of int on your system is 2 bytes; then 512 will be 0x200.
The representation looks like:
0000 0010 0000 0000
|        |        |
-------------------
  Byte 1   Byte 0
So the first byte in memory (byte 0) is 0 and the second one (byte 1) is 2 on little-endian systems.
char is one byte on all systems, so the accesses z.y[0] and z.y[1] are per-byte accesses:
z.y[0] = 0000 0000 = 0
z.y[1] = 0000 0010 = 2
I am just showing how the memory is allocated and how the value is stored. You need to consider the points below, since the output depends on them.
Points to be noted:
sizeof(int) matters, and it varies across systems.
PS: Both members occupy the same memory in a union.
Upvotes: 8
Reputation: 11453
I am going to assume that you use a little-endian system where sizeof(int) is 4 bytes (32 bits) and sizeof(char) is 1 byte (8 bits), and one in which integers are represented in two's complement form. A union is only as large as its largest member, and all the members share this exact piece of memory.
Now, you are writing the integer value 512 to this memory.
512 in binary is 1000000000, or in 32-bit two's complement form:
00000000 00000000 00000010 00000000
Now convert this to its little-endian representation and you'll get:
00000000 00000010 00000000 00000000
|______| |______|
   |        |
  y[0]     y[1]
Now look at what happens when you access this memory using the indices of a char array.
Thus, y[0] is 00000000, which is 0, and y[1] is 00000010, which is 2.
Upvotes: 19
Reputation: 6563
For automatic (non-static) variables, this initialization has the same effect as an assignment to the first member:
union a z;
z.x = 512;
Upvotes: 0