Reputation: 43
This isn't cross-platform code; everything is being performed on the same platform (i.e. the endianness is the same: little-endian).
I have this code:
unsigned char array[4] = {'t', 'e', 's', 't'};
unsigned int out = ((array[0]<<24)|(array[1]<<16)|(array[2]<<8)|(array[3]));
std::cout << out << std::endl;
unsigned char buff[4];
memcpy(buff, &out, sizeof(unsigned int));
std::cout << buff << std::endl;
I'd expect the output of buff to be "test" (with a garbage trailing character because of the lack of '\0'), but instead the output is "tset". Obviously, changing the order of the characters that I'm shifting (3, 2, 1, 0 instead of 0, 1, 2, 3) fixes the problem, but I don't understand the problem. Is memcpy not acting the way I expect?
Thanks.
Upvotes: 4
Views: 4477
Reputation: 146073
You have written a test for platform byte order, and it has concluded: little endian.
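To make that test explicit, here is a minimal sketch. The function name host_is_little_endian is made up for illustration; it only checks which end of a known integer lands at the lowest address.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the question): returns true when the least
// significant byte of a multi-byte integer is stored at the lowest address.
bool host_is_little_endian() {
    const std::uint32_t probe = 1;        // only the least significant byte is non-zero
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);  // copy the byte at the lowest address
    return first_byte == 1;
}
```

On the machine from the question this should return true, which is the same conclusion the "tset" output leads to.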
Upvotes: 0
Reputation: 320481
On a little-endian platform the output should be tset. The original sequence was test, from lower addresses to higher addresses. Then you put it into an unsigned int with the first 't' going into the most significant byte and the last 't' going into the least significant byte. On a little-endian machine the least significant byte is stored at the lower address. This is how it will be copied to the final buff, and this is how it is going to be output: from the last 't' to the first 't', i.e. tset.
On a big-endian machine you would not observe the reversal.
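If the goal is to get "test" back on any host, one option is to undo the packing with shifts instead of memcpy. This is only a sketch; pack_msb_first and unpack_msb_first are names invented here, and a 32-bit value is assumed.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: packs four bytes so that bytes[0] becomes the most
// significant byte, mirroring the shift expression in the question.
std::uint32_t pack_msb_first(const std::array<unsigned char, 4>& bytes) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8)  |
            static_cast<std::uint32_t>(bytes[3]);
}

// Hypothetical inverse: extracts the bytes with shifts rather than memcpy,
// so the result does not depend on how the host stores the integer.
std::array<unsigned char, 4> unpack_msb_first(std::uint32_t value) {
    return {{static_cast<unsigned char>((value >> 24) & 0xFF),
             static_cast<unsigned char>((value >> 16) & 0xFF),
             static_cast<unsigned char>((value >> 8) & 0xFF),
             static_cast<unsigned char>(value & 0xFF)}};
}
```

Shifts operate on the value, not on its memory layout, so unpack_msb_first(pack_msb_first(x)) gives back x on both little-endian and big-endian machines.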
Upvotes: 0
Reputation: 993085
This is because your CPU is little-endian. In memory, the array is stored as:
      +----+----+----+----+
array | 74 | 65 | 73 | 74 |
      +----+----+----+----+
This is represented with increasing byte addresses to the right. However, the integer is stored in memory with the least significant bytes at the left:
      +----+----+----+----+
out   | 74 | 73 | 65 | 74 |
      +----+----+----+----+
This happens to represent the integer 0x74657374. Using memcpy() to copy that into buff reverses the bytes from your original array.
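You can check the layout yourself by dumping the integer's bytes in address order. A rough sketch follows; bytes_in_memory_order is a name made up for this example, and the expected strings assume a 4-byte unsigned int.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: formats the bytes of `value` as hex, in increasing
// address order (the same order memcpy copies them in).
std::string bytes_in_memory_order(unsigned int value) {
    unsigned char raw[sizeof value];
    std::memcpy(raw, &value, sizeof value);
    std::string text;
    for (std::size_t i = 0; i < sizeof value; ++i) {
        char hex[4];
        std::snprintf(hex, sizeof hex, "%02x", static_cast<unsigned>(raw[i]));
        if (i != 0) text += ' ';
        text += hex;
    }
    return text;
}
```

For 0x74657374 a little-endian host should give "74 73 65 74", matching the second diagram, while a big-endian host should give "74 65 73 74".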
Upvotes: 9
Reputation: 206816
You're running this on a little-endian platform.
On a little-endian platform, a 32-bit int is stored in memory with the least significant byte in the lowest memory address. So bits 0-7 are stored at address P, bits 8-15 in address P + 1, bits 16-23 in address P + 2 and bits 24-31 in address P + 3.
In your example: bits 0-7 = 't', bits 8-15 = 's', bits 16-23 = 'e', bits 24-31 = 't'
So that's the order that the bytes are written to memory: "tset"
If you then address that memory as separate bytes (unsigned chars), you'll read them in the order in which they were written to memory.
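The mapping from bit ranges to addresses can be sketched like this. Both function names are invented for the example, and a 32-bit value is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns bits 8*index .. 8*index+7 of `value`.
unsigned char byte_from_bits(std::uint32_t value, unsigned index) {
    assert(index < 4);
    return static_cast<unsigned char>((value >> (8 * index)) & 0xFF);
}

// Hypothetical helper: returns the byte stored at address P + index,
// where P is the address of the integer itself.
unsigned char byte_at_offset(std::uint32_t value, unsigned index) {
    assert(index < 4);
    unsigned char raw[4];
    std::memcpy(raw, &value, sizeof raw);
    return raw[index];
}
```

With value = 0x74657374, byte_from_bits gives 't', 's', 'e', 't' for index 0 to 3. On a little-endian host byte_at_offset should return the same byte for every index, which is why reading the memory in address order yields "tset".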
Upvotes: 2