Reputation: 83
My code is this
// using_a_union.cpp
#include <stdio.h>
union NumericType
{
int iValue;
long lValue;
double dValue;
};
int main()
{
union NumericType Values = { 10 }; // iValue = 10
printf("%d\n", Values.iValue);
Values.dValue = 3.1416;
printf("%d\n", Values.iValue); // garbage value
}
Why do I get garbage value when I try to print Values.iValue
after doing Values.dValue = 3.1416
?
I thought the memory layout would be like this. What happens to Values.iValue
and
Values.lValue;
when I assign something to Values.dValue
?
Upvotes: 8
Views: 1308
Reputation: 355367
In a union
, all of the data members overlap. You can only use one data member of a union at a time.
iValue
, lValue
, and dValue
all occupy the same space.
As soon as you write to dValue
, the iValue
and lValue
members are no longer usable: only dValue
is usable.
Edit: To address the comments below: You cannot write to one data member of a union and then read from another data member. To do so results in undefined behavior. (There's one important exception: you can reinterpret any object in both C and C++ as an array of char
. There are other minor exceptions, like being able to reinterpret a signed integer as an unsigned integer.) You can find more in both the C Standard (C99 6.5/6-7) and the C++ Standard (C++03 3.10, if I recall correctly).
Might this "work" in practice some of the time? Yes. But unless your compiler expressly states that such reinterpretation is guaranteed to be work correctly and specifies the behavior that it guarantees, you cannot rely on it.
Upvotes: 9
Reputation: 124800
Because floating point numbers are represented differently than integers are.
All of those variables occupy the same area of memory (with the double occupying more obviously). If you try to read the first four bytes of that double as an int you are not going to get back what you think. You are dealing with raw memory layout here and you need to know how these types are represented.
EDIT: I should have also added (as James has already pointed out) that writing to one variable in a union and then reading from another does invoke undefined behavior and should be avoided (unless you are re-interpreting the data as an array of char).
Upvotes: 7
Reputation: 106244
You've done this:
union NumericType Values = { 10 }; // iValue = 10
printf("%d\n", Values.iValue);
Values.dValue = 3.1416;
How a compiler uses memory for this union is similar to using the variable with largest size and alignment (any of them if there are several), and reinterpret cast when one of the other types in the union is written/accessed, as in:
double dValue; // creates a variable with alignment & space
// as per "union Numerictype Values"
*reinterpret_cast<int*>(&dValue) = 10; // separate step equiv. to = { 10 }
printf("%d\n", *reinterpret_cast<int*>(dValue)); // print as int
dValue = 3.1416; // assign as double
printf("%d\n", *reinterpret_cast<int*>(dValue)); // now print as int
The problem is that in setting dValue to 3.1416 you've completely overwritten the bits that used to hold the number 10. The new value may appear to be garbage, but it's simply the result of interpreting the first (sizeof int) bytes of the double 3.1416, trusting there to be a useful int value there.
If you want the two things to be independent - so setting the double doesn't affect the earlier-stored int - then you should use a struct
/class
.
It may help you to consider this program:
#include <iostream>
void print_bits(std::ostream& os, const void* pv, size_t n)
{
for (int i = 0; i < n; ++i)
{
uint8_t byte = static_cast<const uint8_t*>(pv)[i];
for (int j = 0; j < 8; ++j)
os << ((byte & (128 >> j)) ? '1' : '0');
os << ' ';
}
}
union X
{
int i;
double d;
};
int main()
{
X x = { 10 };
print_bits(std::cout, &x, sizeof x);
std::cout << '\n';
x.d = 3.1416;
print_bits(std::cout, &x, sizeof x);
std::cout << '\n';
}
Which, for me, produced this output:
00001010 00000000 00000000 00000000 00000000 00000000 00000000 00000000
10100111 11101000 01001000 00101110 11111111 00100001 00001001 01000000
Crucially, the first half of each line shows the 32 bits that are used for iValue: note the 1010 binary in the least significant byte (on the left on an Intel CPU like mine) is 10 decimal. Writing 3.1416 changes the entire 64-bits to a pattern representing 3.1416 (see http://en.wikipedia.org/wiki/Double_precision_floating-point_format). The old 1010 pattern is overwritten, clobbered, an electromagnetic memory no more.
Upvotes: 0
Reputation: 283355
Well, let's just look at simpler example first. Ed's answer describes the floating part, but how about we examine how ints and chars are stored first!
Here's an example I just coded up:
#include "stdafx.h"
#include <iostream>
using namespace std;
union Color {
int value;
struct {
unsigned char R, G, B, A;
};
};
int _tmain(int argc, _TCHAR* argv[])
{
Color c;
c.value = 0xFFCC0000;
cout << (int)c.R << ", " << (int)c.G << ", " << (int)c.B << ", " << (int)c.A << endl;
getchar();
return 0;
}
What would you expect the output to be?
255, 204, 0, 0
Right?
If an int is 32 bits, and each of the chars is 8 bits, then R should correspond to the to the left-most byte, G the second one, and so forth.
But that's wrong. At least on my machine/compiler, it appears ints are stored in reverse byte order. I get,
0, 0, 204, 255
So to make this give the output we'd expect (or the output I would have expected anyway), we have to change the struct to A,B,G,R
. This has to do with endianness.
Anyway, I'm not an expert on this stuff, just something I stumbled upon when trying to decode some binaries. The point is, floats aren't necessarily encoded the way you'd expect either... you have to understand how they're stored internally to understand why you're getting that output.
Upvotes: 2