Aditya Sehgal

Reputation: 2893

Does endianness matter when reading N bits from an unsigned char buffer?

I am trying to read a sequence of bits received from the network (in a pre-defined format) and was wondering whether I have to take care of endianness.

For example, the pre-defined format says that starting from most significant bit the data received would look like this

 |R||  11 bits data||20 bits data||16 bits data| where R is reserved and ignored.

My question is: while extracting, do I have to take care of endianness, or can I just do

u16 first_11_bits = (*(u16 *)data & 0x7FF0) >> 4;
u32 data_20_bits  = *(u32 *)data & 0x000FFFFF;

Upvotes: 0

Views: 146

Answers (3)

Art

Reputation: 20402

What kind of network? IP is defined in terms of bytes, so whatever order the bit stream happens to be in at the underlying layers has been abstracted away from you, and you receive the bits within each byte in the order that your CPU understands them. This means that the abstraction C provides for accessing those bits is portable. Think in terms of shifting left or right in C: whatever the endianness of the CPU, the direction and semantics of shifting in C do not change.

So the question is: how is the data encoded into a byte stream by the other end? However the other end encodes the data is the way you should decode it. If they just shove bits into one byte and send that byte over the network, then you don't need to care. If they put the bits into an int16 and then send it in network byte order, then you need to worry about the endianness of that int16. If they put the bits into an int32 and send that, then you need to worry about the endianness of that int32.
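
As a minimal sketch of shift-based decoding, assume the sender packs the 48 bits from the question most significant bit first into six consecutive bytes (that packing is an assumption, not something the question confirms). The names struct fields and unpack_fields are hypothetical. Because the code only indexes bytes and shifts values, it behaves the same on little-endian and big-endian hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three fields of the 48-bit format. */
struct fields {
    uint16_t f11; /* 11-bit field */
    uint32_t f20; /* 20-bit field */
    uint16_t f16; /* 16-bit field */
};

/* Assumed layout, most significant bit first:
 *   byte 0: R (1 bit, ignored) | top 7 bits of the 11-bit field
 *   byte 1: low 4 bits of the 11-bit field | top 4 bits of the 20-bit field
 *   bytes 2-3: low 16 bits of the 20-bit field
 *   bytes 4-5: the 16-bit field
 */
struct fields unpack_fields(const unsigned char *data)
{
    struct fields out;

    assert(data != NULL);

    out.f11 = (uint16_t)(((data[0] & 0x7Fu) << 4) | (data[1] >> 4));
    out.f20 = ((uint32_t)(data[1] & 0x0Fu) << 16)
            | ((uint32_t)data[2] << 8)
            | (uint32_t)data[3];
    out.f16 = (uint16_t)(((unsigned)data[4] << 8) | data[5]);
    return out;
}
```

If the sender packs the fields differently, the byte indices and shift amounts have to change to match, but the approach stays the same.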

Upvotes: 1

David Schwartz

Reputation: 182885

u16 first_11_bits = (*(u16 *)data & 0x7FF0) >> 4;
u32 data_20_bits  = *(u32 *)data & 0x000FFFFF;

This is undefined behavior (UB). Either data points to a u16 or it points to a u32. It can't point to both. This isn't an endianness issue, it's just an invalid access. You must read through the same pointer type you used to write. Whichever way you wrote it, that's how you read it. Then you can do bit operations on the value you read back, which will be the same as the value you wrote.

One way this can go wrong is that the compiler is free to assume that a write through a u32 * won't affect a value read through a u16 *. So the write may be to memory but the read may be from a value cached in a register. This has broken real-world code.

For example:

u16 i = * (u16 *) data;
* (u32 *) data = 0;
u16 j = * (u16 *) data;

The compiler is free to treat the last line as u16 j = i;. It last read i the very same way, and it is free to assume that a write to a u32 * can't affect the result of a read from a u16 *.
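
One common way to sidestep this problem, which is not what the snippet above does, is to copy the bytes with memcpy instead of dereferencing a cast pointer. The sketch below is illustrative only; read_u16 and write_u32 are hypothetical names. Note that the values still come out in the host's byte order, so this addresses the invalid access and nothing else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy the first two bytes of the buffer into a uint16_t.
 * memcpy works on the bytes themselves, so the compiler cannot assume
 * the read is unrelated to an earlier write to the same buffer. */
uint16_t read_u16(const unsigned char *data)
{
    uint16_t value;

    assert(data != NULL);
    memcpy(&value, data, sizeof value);
    return value; /* host byte order */
}

/* Copy a uint32_t into the first four bytes of the buffer. */
void write_u32(unsigned char *data, uint32_t value)
{
    assert(data != NULL);
    memcpy(data, &value, sizeof value);
}
```

With these helpers, the sequence read, write zero, read again gives a zero on the second read, because each call really goes through the buffer.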

Upvotes: 0

user694733

Reputation: 16047

Yes, you will always need to worry about endianness when reading from or writing to an external resource (file, network, ...). But it has nothing to do with the bit operations.

Casting directly, as in (u16 *)data, is not a portable way to do things.

I recommend having functions to convert data to native types, before doing bit operations.
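
A minimal sketch of such conversion functions, assuming the data arrives in network (big-endian) byte order; load_be16 and load_be32 are hypothetical names, not from any particular library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a native 16-bit value from two bytes in big-endian order. */
uint16_t load_be16(const unsigned char *p)
{
    assert(p != NULL);
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Build a native 32-bit value from four bytes in big-endian order. */
uint32_t load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

The masks from the question can then be applied to native values, for example (load_be16(data) & 0x7FF0) >> 4 for the 11-bit field and load_be32(data) & 0x000FFFFF for the 20-bit field.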

Upvotes: 0
