Reputation: 11
I was just beginning my quest with unions when I found something weird.
If I run this program:
#include <iostream>
using namespace std;

union myun {
public:
    int x;
    char c;
};

int main()
{
    myun y;
    //y.x=65;
    y.c='B';
    cout<<y.x;
}
The output was some garbage value, which doesn't change if I change the value of y.c. Next I did this:
#include <iostream>
using namespace std;

union myun {
public:
    int x;
    char c;
};

int main()
{
    myun y;
    y.x=65;
    y.c='B';
    cout<<y.x;
}
The output was 66, as expected, because y.c='B' replaces the 65 with the ASCII value of 'B' (66). Can anyone explain the first case?
Upvotes: 1
Views: 1613
Reputation: 881333
It's actually undefined behaviour to read from a union member that wasn't the last one written to.
The standard makes a narrow exception for members that are layout-compatible (standard-layout structs that share a common initial sequence), but that does not apply here with an int and a char.
From the C++03 standard (superseded by C++11 now but still relevant):
In a union, at most one of the data members can be active at any time, that is, the value of at most one of the data members can be stored in a union at any time.
I think you may want to look into reinterpret_cast if you want to do this sort of overlaying activity.
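As a minimal sketch of that reinterpret_cast suggestion: viewing an object through an unsigned char pointer is one of the few casts the aliasing rules permit, so it can be used to look at a single byte of an int without going through a union. The helper name first_byte is hypothetical, chosen only for this illustration.

```cpp
// Return the lowest-addressed byte of an int.
// Viewing any object as unsigned char is allowed by the aliasing rules;
// casting to most other unrelated types is not.
unsigned char first_byte(const int& value) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    return p[0];
}
```

Which part of the int's value this byte carries depends on the machine's byte order, so first_byte(65) may be 65 or 0.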
In terms of what's actually happening under the covers in the first one, the hex value of the number output:
-1218142398 (signed) -> 3076824898 (unsigned) -> B7649F42 (hex)
                                                       ==
                                                       ^^
                                                       ||
                                       0x42 is 'B' ----++
should provide a clue. The y.c='B' is only setting a single byte of that union, leaving the other three bytes (in my case) indeterminate.
By putting in the y.x=65 line before that point, it's setting all four bytes, with those three spare ones being set to zero. Hence they stay at zero when you set the single byte in the following assignment.
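The conversion chain in the hex breakdown above can be checked at compile time. This is a sketch that assumes a 32-bit int, as on the answerer's machine; as_unsigned and low_byte are hypothetical helper names, not something from the question.

```cpp
#include <cstdint>

// Convert a signed 32-bit value to unsigned (modulo 2^32, which is well defined).
constexpr std::uint32_t as_unsigned(std::int32_t v) {
    return static_cast<std::uint32_t>(v);
}

// Extract the least significant byte of the value.
constexpr unsigned char low_byte(std::int32_t v) {
    return static_cast<unsigned char>(as_unsigned(v) & 0xFFu);
}

static_assert(as_unsigned(-1218142398) == 3076824898u, "signed -> unsigned");
static_assert(as_unsigned(-1218142398) == 0xB7649F42u, "unsigned -> hex");
static_assert(low_byte(-1218142398) == 'B', "low byte is 0x42, i.e. 'B'");
```

The upper three bytes (B7 64 9F) are whatever happened to be on the stack; only the low 0x42 comes from the assignment.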
Upvotes: 4
Reputation: 321
The variable y is of the union type, and its size is four bytes (assuming a 4-byte int). For instance, y's memory layout looks like this:
---------------------------------
| byte1 | byte2 | byte3 | byte4 |
---------------------------------
1) In the first program, the statement y.c='B'; sets only byte1, while byte2, byte3 and byte4 keep whatever random values were on the stack.
2) In the second program, the statement y.x=65; sets byte1 to 65 and byte2, byte3 and byte4 to zero. Then the statement y.c='B'; sets byte1 to the ASCII value of 'B', hence the output of 66.
Upvotes: 0
Reputation: 51226
Local variables (more specifically, variables on the stack, i.e. having "automatic" storage class) of POD type aren't initialised to anything when they are declared, so the 3 bytes not affected by your assignment to y.c (assuming a 4-byte int; the count depends on sizeof(int) on your platform) will contain random garbage.
Also note that which part of the int's value is affected by the assignment to y.c depends on the endianness of the CPU, so this code will behave differently on different systems even if you initialise y.x before assigning to y.c.
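The endianness point can be illustrated without touching a union at all, by doing the same byte overwrite with std::memcpy, which is well defined. This is only a sketch; is_little_endian and overwrite_first_byte are hypothetical names, and it ignores exotic mixed-endian layouts.

```cpp
#include <cstring>

// True when the lowest-addressed byte of an int holds its least significant bits.
bool is_little_endian() {
    const int one = 1;
    unsigned char first = 0;
    std::memcpy(&first, &one, 1);
    return first == 1;
}

// Mimic "y.x = initial; y.c = c;" on raw bytes: overwrite only the
// lowest-addressed byte of an int, then return the resulting int.
int overwrite_first_byte(int initial, char c) {
    unsigned char bytes[sizeof(int)];
    std::memcpy(bytes, &initial, sizeof(int));   // copy out the object representation
    bytes[0] = static_cast<unsigned char>(c);    // the byte that y.c would occupy
    int result = 0;
    std::memcpy(&result, bytes, sizeof(int));    // copy the modified bytes back
    return result;
}
```

On a little-endian machine overwrite_first_byte(65, 'B') gives 66. On a big-endian machine with a 32-bit int it would give 0x42000041, because the overwritten byte is the most significant one.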
Upvotes: 0
Reputation: 4114
In a union, the memory allocated is equal to the size of the largest member, which in your case is the int, i.e. 2 bytes with a 16-bit compiler (4 bytes on most current ones). All members use the same memory to store their data, so in practice only one member can be stored at a time.
When you assigned the value 'B' to the char member, 66 was stored in its 1 byte of memory. When you then output the int member, its value was computed from all the bytes of the int, of which only one had been set, hence the garbage value.
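The size and overlap described above can be confirmed with a few compile-time checks. This is a sketch using the myun union from the question; untouched_bytes is a hypothetical helper added for illustration.

```cpp
#include <cstddef>

union myun {
    int x;
    char c;
};

// The union is at least as large as its largest member (padding may add more).
static_assert(sizeof(myun) >= sizeof(int), "union must be able to hold an int");

// Both members start at the same address, offset 0 from the union itself.
static_assert(offsetof(myun, x) == 0, "x starts at offset 0");
static_assert(offsetof(myun, c) == 0, "c starts at offset 0");

// Number of bytes that an assignment to y.c leaves untouched.
constexpr std::size_t untouched_bytes() {
    return sizeof(myun) - sizeof(char);
}
```

With a 2-byte int untouched_bytes() is 1; with the more common 4-byte int it is 3.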
Upvotes: 1
Reputation: 12044
Because sizeof(int) != sizeof(char).
That is to say, an integer and a character take up different amounts of memory (in the average computer these days, int is 4 bytes, char is 1 byte). The union is only as large as its largest member. Thus, when you set the char, you only set 1 byte of memory - the other 3 bytes are just random garbage.
Either set the biggest member of the union first, or do something like:
memset(&y, 0, sizeof(y));
to fill the entire union with zero.
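A sketch of the memset approach in context, using the myun union from the question. Note that reading y.x while c is the active member is still formally undefined, so this version copies the bytes out with std::memcpy instead; zero_then_set_char is a hypothetical name, and the behaviour described is what mainstream compilers do in practice.

```cpp
#include <cstring>

union myun {
    int x;
    char c;
};

// Zero every byte of the union, write only the char member, then
// return the union's bytes interpreted as an int.
int zero_then_set_char(char value) {
    myun y;
    std::memset(&y, 0, sizeof(y));         // all bytes are now 0
    y.c = value;                           // changes a single byte
    int seen = 0;
    std::memcpy(&seen, &y, sizeof(seen));  // read the bytes back without touching y.x
    return seen;
}
```

On a little-endian machine zero_then_set_char('B') is expected to return 66, since the remaining bytes are no longer garbage.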
Upvotes: 1
Reputation: 234424
y.c='B';
cout<<y.x;
This has undefined behaviour. At any given time, a union contains only one of its members. You cannot read the int member while the union actually contains the char member. Because the behaviour is not defined, the compiler is allowed to do what it wants with the code.
Upvotes: 1
Reputation: 63471
Well, you kinda explained the first case when you showed your understanding of the second case.
Initialising the character part only modifies one byte of storage that is the size of an int. Assuming a 32-bit int, that means 3 bytes are still uninitialised... Hence the garbage.
Here's the memory usage of your union:
            byte
            0 1 2 3
          +------------
myun::x   | X X X X
myun::c   | X - - -
When you set x, you set an integer, so all remaining bytes are initialised. When you set c, you only modify a single byte.
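To see that assigning an int writes every byte, its bytes can be copied out and inspected. This is a sketch with hypothetical helper names (bytes_of, nonzero_bytes); it works on a plain int, which holds the same bytes that y.x would hold after y.x=65.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Copy out the bytes of an int so they can be inspected individually.
std::array<unsigned char, sizeof(int)> bytes_of(int value) {
    std::array<unsigned char, sizeof(int)> out{};
    std::memcpy(out.data(), &value, sizeof(int));
    return out;
}

// Count how many bytes of the int are non-zero.
std::size_t nonzero_bytes(int value) {
    std::size_t count = 0;
    for (unsigned char b : bytes_of(value)) {
        if (b != 0) ++count;
    }
    return count;
}
```

For the value 65 exactly one byte is non-zero and the rest are 0, which is why the later y.c='B' produces a clean 66 in the second program.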
Upvotes: 1