Reputation: 2750
What are the underlying transformations that are necessary to convert data in a little-endian system into network byte order? For 2-byte and 4-byte data there are well-known functions (such as htons, ntohl, etc.) to encapsulate the changes, but what happens for strings of 1-byte data (if anything)?
Also, Wikipedia implies that little-endian is the mirror image of big-endian, but if that were true why would we need specific handling for 2 and 4 byte data?
The essay "On Holy Wars and a Plea for Peace" seems to imply that there are many different flavors of little-endian -- it's an old essay -- does that still apply? Are byte order markers like the ones found at the beginning of Java class files still necessary?
And finally, is 4-byte alignment necessary for network-byte order?
Upvotes: 7
Views: 709
Reputation: 44804
The basic idea is that all multi-byte types have to have the order of their bytes reversed. A four byte integer would have bytes 0 and 3 swapped, and bytes 1 and 2 swapped. A two byte integer would have bytes 0 and 1 swapped. A one byte character does not get swapped.
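As a rough illustration of the swapping described above, here is a minimal sketch in C. The helper name reverse_bytes is made up for this example and is not a standard function; in real code you would normally reach for htons/htonl and friends instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a library function): reverse the n bytes
 * starting at buf, in place. For n == 4 this swaps bytes 0 and 3 and
 * bytes 1 and 2; for n == 2 it swaps bytes 0 and 1; for n == 1 it
 * leaves the byte untouched. */
static void reverse_bytes(uint8_t *buf, size_t n)
{
    assert(buf != NULL);
    for (size_t i = 0; i < n / 2; i++) {
        uint8_t tmp = buf[i];
        buf[i] = buf[n - 1 - i];
        buf[n - 1 - i] = tmp;
    }
}
```

Calling reverse_bytes on each byte of a string one at a time (n == 1) changes nothing, which is why character data needs no conversion.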
There are two very important implications of this that non-practitioners and novices don't always realise:
Upvotes: 0
Reputation: 308206
is 4-byte alignment necessary for network-byte order?
No specific alignment is necessary for bytes going over a network. Your processor may demand a certain alignment in memory, but it's up to you to resolve the discrepancy. The x86 family usually doesn't make such demands.
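One way to sidestep the alignment question entirely is to assemble the value from individual bytes instead of casting the buffer pointer to a wider integer type. A minimal sketch, assuming a 32-bit field in network byte order sitting at an arbitrary offset in a receive buffer (read_be32 is a name invented for this example):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit big-endian (network order) value
 * from p. Only single-byte loads are used, so p may have any alignment,
 * and the result is the same on little- and big-endian hosts. */
static uint32_t read_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

For example, with a buffer holding the bytes 0xFF 0x12 0x34 0x56 0x78, read_be32(buf + 1) yields 0x12345678 even though buf + 1 is an odd address. Copying the four bytes into a uint32_t with memcpy and then calling ntohl is another common way to get the same result.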
Upvotes: 0
Reputation: 25677
1-byte data doesn't require any conversion between byte orders (this is an advantage of UTF-8 over UTF-16 and UTF-32 for string encoding).
Upvotes: 0
Reputation: 7712
Specific handling functions for 2 and 4 byte data take advantage of the fact that there are processor instructions that operate on specific data sizes. Running a 1-byte reversing function four times is certainly less efficient than using wider instructions to perform the same (albeit increased in scale) operations on all four bytes at once.
Upvotes: 0
Reputation: 41180
Let's say you have the ASCII text "BigE" in an array b of bytes.
b[0] == 'B'
b[1] == 'i'
b[2] == 'g'
b[3] == 'E'
This is network order for the string as well.
If it was treated as a 32 bit integer, it would be
'B' + ('i' << 8) + ('g' << 16) + ('E' << 24)
on a little endian platform and
'E' + ('g' << 8) + ('i' << 16) + ('B' << 24)
on a big endian platform.
If you convert each 16-bit word separately, you'd get neither of these:
'i' + ('B' << 8) + ('E' << 16) + ('g' << 24)
which is why ntohl and ntohs are both required.
In other words, ntohs swaps bytes within a 16-bit short, and ntohl reverses the order of the four bytes of its 32-bit word.
Upvotes: 6