Koto

Reputation: 511

Does endianness affect writing an odd number of bytes?

Imagine you have a uint64_t named bytes and you know that you only need 7 bytes, because the integers you store will never exceed what fits into 7 bytes.

When writing a file you could do something like

std::ofstream fout(fileName);
fout.write((char *)&bytes, 7);

to only write 7 bytes.

The question I'm trying to figure out is whether the endianness of a system affects the bytes that are written to the file. I know that endianness affects the order in which the bytes are written, but does it also affect which bytes are written? (Only for the case where you write fewer bytes than the integer normally has.)

For example, on a little endian system the first 7 bytes are written to the file, starting with the LSB. On a big endian system what is written to the file?

Or to put it differently, on a little endian system the MSB (the 8th byte) is not written to the file. Can we expect the same behavior on a big endian system?

Upvotes: 1

Views: 720

Answers (3)

Pablo Santa Cruz

Reputation: 181290

Endianness only affects the way multi-byte integers (16, 32, 64 bits) are written. If you are writing individual bytes (as in your case), they are written in exactly the order in which you write them.

For example, this kind of write will be affected by endianness:

std::ofstream fout(fileName);
int i = 67;
fout.write((char *)&i, sizeof(int));

Upvotes: 2

Aconcagua

Reputation: 25526

uint64_t bytes = ...;
fout.write((char *)&bytes, 7);

This will write exactly 7 bytes, starting from the address &bytes. The layout of the eight bytes in memory differs between LE and BE systems, though (let's assume the variable is located at address 0xff00):

            0xff00  0xff01  0xff02  0xff03  0xff04  0xff05  0xff06  0xff07
LE: [byte 0 (LSB!)][byte 1][byte 2][byte 3][byte 4][byte 5][byte 6][byte 7 (MSB)]
BE: [byte 7 (MSB!)][byte 6][byte 5][byte 4][byte 3][byte 2][byte 1][byte 0 (LSB)]

The starting address (0xff00) doesn't change when casting to char*, and you'll write out the byte at exactly this address plus the six following ones. In both cases (LE and BE), the byte at address 0xff07 won't be written. Looking at the memory table above, it should be obvious that on a BE system you lose the LSB while storing the MSB, which does not carry any information.
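
To see this on a real machine, a small sketch like the following can help. The helper name first_seven_bytes is made up for illustration; it copies the same seven bytes that fout.write((char *)&bytes, 7) would hand to the stream, so you can inspect them without touching a file.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of any library): returns the first seven
// bytes of the object representation of value, in memory order. These are
// the bytes that write((char *)&value, 7) would pass to the stream.
std::array<unsigned char, 7> first_seven_bytes(std::uint64_t value) {
    std::array<unsigned char, 7> out{};
    std::memcpy(out.data(), &value, out.size());
    return out;
}
```

For the value 0x0011223344556677, an LE machine should give 77 66 55 44 33 22 11 (only the zero MSB is dropped), while a BE machine should give 00 11 22 33 44 55 66 (the LSB 0x77 is lost).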

On a BE system, you could instead write fout.write((char *)&bytes + 1, 7). Be aware, though, that this still leaves a portability issue:

fout.write((char *)&bytes + isBE(), 7);
//                           ^ giving true/false, i. e. 1 or 0
// (such function/test existing is an assumption!)

This way, data written by a BE system would be misinterpreted by an LE system when read back, and vice versa. The safe version is to decompose the value into single bytes, as geza did in his answer. To avoid multiple system calls, you might decompose the value into an array instead and write that out in one call.
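
The array variant could look roughly like the sketch below. The names encode_le7 and write_le7 are made up for illustration, and the assert simply encodes the question's assumption that the value fits into 7 bytes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>

// Hypothetical helper: splits value into 7 bytes, least significant first.
// The shifts operate on the value, not on memory, so the result is the
// same on LE and BE hosts.
std::array<unsigned char, 7> encode_le7(std::uint64_t value) {
    assert((value >> 56) == 0 && "value must fit into 7 bytes");
    std::array<unsigned char, 7> buf{};
    for (std::size_t i = 0; i < buf.size(); ++i) {
        buf[i] = static_cast<unsigned char>(value >> (8 * i));
    }
    return buf;
}

// Hypothetical helper: hands all seven bytes to the stream in one call.
void write_le7(std::ostream& out, std::uint64_t value) {
    const std::array<unsigned char, 7> buf = encode_le7(value);
    out.write(reinterpret_cast<const char*>(buf.data()),
              static_cast<std::streamsize>(buf.size()));
}
```

The stream should be opened with std::ios::binary, so that no newline translation touches the bytes on platforms that distinguish text and binary mode.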

If you are on Linux/BSD, there is a nice alternative, too:

bytes = htole64(bytes); // will likely result in a no-op on LE system...
fout.write((char *)&bytes, 7);

Upvotes: 1

geza

Reputation: 29962

The question I'm trying to figure out is whether the endianness of a system affects the bytes that are written to the file.

Yes, it affects which bytes are written to the file.

For example, on a little endian system the first 7 bytes are written to the file, starting with the LSB. On a big endian system what is written to the file?

The first 7 bytes are written to the file, but this time starting with the MSB. So, in the end, the lowest byte is not written to the file, because on big endian systems the last byte is the lowest byte.

So this is not what you want, because you lose information.

A simple solution is to convert the uint64_t to little endian and write the converted value. Or just write the value byte by byte, in the order a little endian system would write it:

uint64_t x = ...;

write_byte(uint8_t(x));
write_byte(uint8_t(x>>8));
write_byte(uint8_t(x>>16));
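write_byte(uint8_t(x>>24));
write_byte(uint8_t(x>>32));
write_byte(uint8_t(x>>40));
write_byte(uint8_t(x>>48)); // 7th and last byte; bits 56..63 are never written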
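
Reading the value back is the mirror image of the shifts above. A minimal sketch, assuming the file holds 7 bytes per value with the least significant byte first; decode_le7 and read_le7 are hypothetical names, not library functions:

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>

// Hypothetical helper: rebuilds the value from 7 bytes stored least
// significant byte first. Works the same on LE and BE hosts, because it
// shifts the value instead of reinterpreting memory.
std::uint64_t decode_le7(const unsigned char (&bytes)[7]) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < 7; ++i) {
        value |= static_cast<std::uint64_t>(bytes[i]) << (8 * i);
    }
    return value;
}

// Hypothetical helper: reads 7 bytes from a stream opened in binary mode.
// Returns false (and leaves value untouched) if fewer than 7 bytes arrive.
bool read_le7(std::istream& in, std::uint64_t& value) {
    unsigned char bytes[7];
    if (!in.read(reinterpret_cast<char*>(bytes), sizeof bytes)) {
        return false;
    }
    value = decode_le7(bytes);
    return true;
}
```

The top byte of the result is always zero, which matches the assumption that the stored integers never need more than 7 bytes.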

Upvotes: 1
