Reputation: 2946
I'm reading "Learning Core Audio: A Hands-On Guide to Audio Programming for Mac and iOS" by Chris Adamson and at one point the author describes big-endian as:
the high bits of a byte or word are numerically more significant than the lower ones.
However, until now I thought the problem of big/little endian only applies to byte order, not bit order. One byte has the same bit order (left to right) no matter whether we're talking about little-endian or big-endian systems. Am I wrong? Is the author wrong? Or did I misunderstand his point?
Upvotes: 52
Views: 26774
Reputation: 347
Short answer:
The terms "big-endian" and "little-endian" refer to the order in which groups of bits (like a 16-bit integer, a 32-bit integer, or even an arbitrary group such as an 11-bit field) are stored in memory. They do not affect the order of individual bits within these groups. In a big-endian system, the most significant byte of a group is stored first (at the lowest memory address), while in a little-endian system, the least significant byte is stored first. Bit order within a group is typically consistent across different systems.
I was surprised by the answers I found online, including through Google searches, because they were misleading.
More detailed answer:
In a big-endian system, the most significant byte of a group is stored first (at the lowest memory address); you can picture the value as filled in from the left. In a little-endian system, the least significant byte of a group is stored first; you can picture the value as filled in from the right.
For example, suppose we have this struct:
struct ByteExample {
    uint16_t a; // 2 bytes
    uint32_t b; // 4 bytes
    uint8_t  c; // 1 byte
};
Then I fill it like this:
struct ByteExample example;
example.a = 0x1234;
example.b = 0x56789ABC;
example.c = 0xDE;
According to the struct we defined, a is 2 bytes, b is 4 bytes, and c is 1 byte. Endianness determines the byte order within each of these groups, but the bit order within each byte (group of bits) remains unchanged. (Note: a real compiler may insert padding between members for alignment; assume a packed layout here so that the bytes of a, b, and c are contiguous.)
In memory, the big-endian layout looks like this:
12 34 56 78 9A BC DE
00010010 00110100 01010110 01111000 10011010 10111100 11011110
And the little-endian layout looks like this:
34 12 BC 9A 78 56 DE
00110100 00010010 10111100 10011010 01111000 01010110 11011110
As you can see, only the bytes within each group are reordered, not the bits. In the little-endian layout, each group starts with its least significant byte.
(My PC is little-endian; the original post showed a debugger screenshot of the actual memory contents here.)
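If you want to check this on your own machine, here is a minimal sketch (my own illustration, not from the original post) that dumps the raw bytes of the struct. Remember that without a packed attribute the compiler may insert padding bytes into the dump:

#include <stdio.h>
#include <stdint.h>

struct ByteExample {
    uint16_t a; // 2 bytes
    uint32_t b; // 4 bytes
    uint8_t  c; // 1 byte
};

int main(void) {
    struct ByteExample example;
    example.a = 0x1234;
    example.b = 0x56789ABC;
    example.c = 0xDE;

    // Print every byte of the struct as it sits in memory.
    const unsigned char *p = (const unsigned char *)&example;
    for (size_t i = 0; i < sizeof example; ++i)
        printf("%02X ", p[i]);
    printf("\n");
    return 0;
}

On a typical little-endian machine the bytes of a appear as 34 12 and the bytes of b as BC 9A 78 56, possibly with padding bytes in between.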
Now the confusing part: what if I have an 11-bit field, or some group whose size is not divisible by 8 (the byte size)?
I keep writing "group of bits" deliberately, because I don't want to limit the concept of big-endian and little-endian to whole bytes; the following example shows why.
Let's take a look at this example. We have this struct:
struct ArbitraryExample {
    uint32_t field1 : 5;  // 5 bits
    uint32_t field2 : 11; // 11 bits
};
Then I fill it like this:
struct ArbitraryExample arEx;
arEx.field1 = 15;  // 5 bits: 0 1111
arEx.field2 = 255; // 11 bits: 000 1111 1111
On a typical big-endian compiler, the memory will look like this:
78 FF
0111 1000 1111 1111
field1 is the first 5 bits from the left, then field2 is the next 11 bits.
On a little-endian compiler it will look like this:
ef 1f
1110 1111 0001 1111
Now this is exactly the confusing part: field1 is the first 5 bits counted from the right side of the first byte. That leaves 3 extra bits at the top of the first byte, but because the system is little-endian we read the whole thing as one value from right to left, so those 3 extra bits are the low bits of field2. In other words, field2's 11 bits start in the second byte (its high bits) and end in the top 3 bits of the first byte (its low bits).
Again, here is the memory, but with highlights:
ef 1f
111}(0 1111) {0001 1111
( -> start of field1
) -> end of field1
{ -> start of field2
} -> end of field2
(My PC is little-endian; the original post showed two debugger screenshots here: one after filling field1, and one after filling field2.)
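Again, a minimal sketch (my own, and remember that bitfield layout is implementation-defined, so the output depends on your compiler) to reproduce this on your machine:

#include <stdio.h>
#include <stdint.h>

struct ArbitraryExample {
    uint32_t field1 : 5;  // 5 bits
    uint32_t field2 : 11; // 11 bits
};

int main(void) {
    struct ArbitraryExample arEx;
    arEx.field1 = 15;  // 5 bits: 0 1111
    arEx.field2 = 255; // 11 bits: 000 1111 1111

    // Dump the raw bytes of the 32-bit storage unit.
    // GCC on little-endian x86 typically prints "EF 1F 00 00".
    const unsigned char *p = (const unsigned char *)&arEx;
    for (size_t i = 0; i < sizeof arEx; ++i)
        printf("%02X ", p[i]);
    printf("\n");
    return 0;
}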
In summary:
Bit order within a byte or a group of bits is typically consistent across different systems, with the most significant bit (MSB) being the leftmost bit and the least significant bit (LSB) being the rightmost bit. Little-endian and big-endian describe how these groups of bits are ordered in memory, not the bits inside them.
Upvotes: 0
Reputation: 4888
Bit 5 is the bit at index 5, not the bit at address &byte + 5 bits. Your computer interprets all bit operations in terms of bit indices, not physical bit positions.
Of course "bit endianness" still matters when you have to guarantee a certain sequence of bits, e.g. in network protocols or over peripherals such as an SPI bus. But in that case your library might say something like "data is always sent MSB first", which means each byte is sent with its most significant bit first, and that takes care of that.
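As an illustration, here is a sketch of what "MSB first" means at the bit level; send_bit is a made-up stand-in for whatever your hardware layer actually provides, not a real API:

#include <stdint.h>

// Send one byte most-significant-bit first, as SPI libraries often
// document. send_bit() is hypothetical.
void send_byte_msb_first(uint8_t byte, void (*send_bit)(int)) {
    for (int i = 7; i >= 0; --i)
        send_bit((byte >> i) & 1); // index 7 (MSB) down to 0 (LSB)
}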
Upvotes: 0
Reputation: 39426
This is not an answer to the stated question (it has been well answered by others already), but a footnote explaining some of the terms, in the hope that it will clarify the related concepts. In particular, this is not specific to C at all.
Endianness and byte order
When a value larger than a byte is stored or serialized into multiple bytes, the choice of the order in which the component bytes are stored is called byte order, or endianness.
Historically, there have been three byte orders in use: "big-endian", "little-endian", and "PDP-endian" or "middle-endian".
Big-endian and little-endian byte order names are derived from the way they order the bytes: big-endian puts the most significant byte (the byte that affects the logical value most) first, with successive bytes in decreasing order of significance; and little-endian puts the least significant byte first, with successive bytes in increasing order of significance.
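As a concrete sketch (my own, not part of the original answer), here is how one might serialize a 32-bit value in either byte order using shifts; this behaves identically on any host, regardless of its native endianness:

#include <stdint.h>

// Store v into out[0..3] most significant byte first (big-endian).
void store_be32(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)(v);
}

// Store v least significant byte first (little-endian).
void store_le32(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)(v);
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}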
Note that byte order may differ for integer types and floating-point types; they may even be implemented in separate hardware units. On most hardware they do have the same byte order, though.
Bit order
Bit order is a very similar concept to endianness, except that it involves individual bits rather than bytes. The two concepts are related, but not the same.
Bit order is only meaningful when bits are serialized, for example over a serial, SPI, or I2C bus, one bit after another.
When bits are referred to in a larger group used in parallel, as one unit, like in a byte or a word, there is no order: there is only labeling and significance. (It is because they are accessed and manipulated as a group, in parallel, rather than serially one by one, that there is no specific order. Their interpretation as a group yields a different significance for each bit, and we humans can label or number them for ease of reference.)
Bit significance
When a group of bits are treated as a binary value, there is a least significant bit, and a most significant bit. These names are derived from the fact that if you change the least significant bit, the value of the bit group changes by the smallest amount possible; if you change the most significant bit, the value of the bit group changes by the largest amount possible (by a single bit change).
Let's say you have a group of five bits, say a, b, c, d, and e, that form a five-bit unsigned integer value. If a is the most significant, and e the least significant, and the three others are in order of decreasing significance, the unsigned integer value is
value = a·2^4 + b·2^3 + c·2^2 + d·2^1 + e·2^0
i.e.
value = 16a + 8b + 4c + 2d + e
In other words, bit significance is derived from the mathematical (or logical) interpretation of a group of bits, and is completely separate from the order in which the bits might be serialized on some bus, and also from any human-assigned labels or numbers.
This is true for all bit groups that logically construct numerical values, even for floating-point numbers.
Bit labels or bit numbering
For ease of reference in documentation, for example, it is often useful to label the individual bits. This is essentially arbitrary; indeed, I used letters a to e in the example above. More often, numbers are easier than letters, since it's not that easy to label more than 26 bits with single letters.
There are two approaches to label bits with numbers.
The most common one currently is to label the bits according to their significance, with bit 0 referring to the least significant bit. This is useful, because bit i then has the logical value 2^i.
On certain architectures' documentation, like IBM's POWER documentation, the most significant bit is labeled 0, with the rest in decreasing order of significance. In this case, the logical value of a bit depends on the number of bits in the unit: if a unit has N bits, then bit i has the logical value 2^(N-i-1). (For example, in an 8-bit unit, bit 0 has the logical value 2^7 = 128.)
While this ordering may feel weird, these architectures are all big-endian, and it might be useful for humans to just remember/assume that most significant comes first on these systems.
Remember, however, that this is a completely arbitrary decision, and in both cases the documentation could be written with the other bit labeling scheme, without any effect on the real-world performance of the systems. It is like choosing whether to write from left to right, or from right to left (or top-down, for that matter): the contents are unaffected, as long as you know and understand the convention.
While there is some correlation between byte order and bit labeling, all four of the above concepts are separate.
There is correlation between byte order and bit labeling, in the sense that the documentation for a lot of big-endian hardware uses bit labeling where the most significant bit is bit zero, but that is only because of choices made by humans.
In C, the order in which the compiler packs bitfields in a struct varies between compilers and architectures; it is not specified by the C standard at all. Because of this, it is usually a bad idea to read binary files into a struct type with bitfields. (Even if it works on some specific machine and compiler, there is no guarantee it works on others; often, it does not. So it definitely makes the code less portable.) Instead, read into a buffer, an array of unsigned char, and use helper accessor functions to extract the bit fields from the array using bit shifts (<<, >>), binary ORs (|), and masking (binary AND, &).
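For illustration, a minimal sketch of such an accessor, assuming the bits arrive MSB-first within each byte (the name extract_bits and the bit-numbering convention are just illustrative choices, not a standard API):

#include <stddef.h>
#include <stdint.h>

// Extract an unsigned field of 'width' bits (width <= 32) starting
// at bit offset 'start', counting bits from the most significant
// bit of buf[0] onward.
static uint32_t extract_bits(const unsigned char *buf,
                             size_t start, size_t width) {
    uint32_t value = 0;
    for (size_t i = 0; i < width; ++i) {
        size_t bit = start + i;
        unsigned b = (buf[bit / 8] >> (7 - bit % 8)) & 1u;
        value = (value << 1) | b; // shift in one bit at a time
    }
    return value;
}

For example, with the two bytes {0x78, 0xFF}, extract_bits(buf, 0, 5) returns 15 and extract_bits(buf, 5, 11) returns 255.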
Upvotes: 40
Reputation: 12407
The "endianness" of a byte in terms of bit-order is not really a concern unless you're working with an exotic system that allows you to address bits separately. It can be a concern when deciding how to transmit data over the wire, but this decision is usually made at a hardware level.
In terms of relevance to audio streaming, it could very well be important. The hardware which is responsible for converting the stream of digital audio into analogue audio signals may expect the bits in the stream to be in a particular order. If they're wrong, the sound might come out completely whacked. Perhaps the author of your book elaborates on this? Anyway, as I mentioned earlier, this is normally decided at a hardware level and is not really a concern when programming at a user or even a kernel level. Generally, industry standards will define how two pieces of hardware will transmit the data to each other. As long as all your hardware agrees on the bit endianness, then everything is fine.
Upvotes: 14
Reputation: 3351
Talking about the order of bits inside a byte does not make sense: bits inside a byte are not addressable, so you can't define an order of those bits to use as a reference for a definition of endianness. Unlike bits, bytes are addressable, so there is an address order that we can use as a reference to define what little-endian and big-endian mean.
You may get the impression that the left shift (<<) and right shift (>>) bitwise operators indirectly imply that there is a defined order of bits inside a byte, but that's not true. These two terms are based on an abstract representation of a byte where the lowest bit is at the right and bits get more significant going left; but by definition, a left shift has the same effect as multiplication by 2, and a right shift has the same effect as division by 2 (for unsigned integers).
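A quick sketch of that arithmetic definition; the result is the same regardless of how the hardware physically stores the bits:

#include <stdio.h>

int main(void) {
    unsigned x = 6;
    // Left shift multiplies by 2, right shift divides by 2 (for
    // unsigned integers), independent of physical bit layout.
    printf("%u %u\n", x << 1, x >> 1); // prints "12 3"
    return 0;
}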
Upvotes: 9
Reputation: 215577
The only sense in which there is such a thing as "bit order" is the order in which bits are assigned to bitfields. For instance, in:
union {
    struct {
        unsigned char a:4;
        unsigned char b:4;
    } bf;
    unsigned char c;
};
depending on the implementation, the representation of bf.a could occupy the high four bits of c, or the low four bits of c. Whether the ordering of bitfield members matches the byte order is implementation-defined.
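To see which choice your compiler made, you could run a small probe along these lines (a sketch, assuming the compiler accepts unsigned char bitfields, which is a common extension):

#include <stdio.h>

int main(void) {
    union {
        struct {
            unsigned char a:4;
            unsigned char b:4;
        } bf;
        unsigned char c;
    } u;

    u.c = 0;
    u.bf.a = 0xF; // set all four bits of a
    if (u.c == 0x0F)
        printf("bf.a occupies the low four bits of c\n");
    else if (u.c == 0xF0)
        printf("bf.a occupies the high four bits of c\n");
    return 0;
}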
Upvotes: 28
Reputation: 340466
Since you can't normally address the bits within a byte individually, there's no concept of "bit endianness" generally.
Upvotes: 39