Abhishek Ghosh

Reputation: 665

"Bit-fields are assigned left to right on some machines and right to left on others"- unable to get the concept from "The C Programming Language" book

I was going through the text "The C Programming Language" by Kernighan and Ritchie. At the end of the section on bit-fields, the authors say:

"Fields are assigned left to right on some machines and right to left on others. This means that although fields are useful for maintaining internally-defined data structures, the question of which end comes first has to be carefully considered when picking apart externally-defined data; programs that depend on such things are not portable."

- The C Programming Language [2e] by Kernighan & Ritchie [Section 6.9, p.150]

I do not understand what these lines mean. Can anyone please explain this to me, possibly with a diagram?


PS: I have taken a computer organization and architecture course, so I know how computers deal with bits and bytes. In a computer system, the smallest unit of information is a single bit, which can be either 0 or 1. Eight such bits form a byte. Memories are byte-addressable, which means that each byte in memory has an address associated with it. But processors usually have word lengths of 2 bytes (very old systems), 4 bytes, 8 bytes... This means that in one memory cycle, the CPU can fetch a word's worth of bytes from main memory and put them in its registers. How these bytes are placed in registers depends on the endianness of the system.

But I do not get what the authors mean by "left to right" or "right to left". The words seem like they are related to the endianness but endianness depends on the CPU and C compilers have nothing to do with it... The question which comes to my mind is "left to right" of "what"? What object are the authors referring to?

Upvotes: 0

Views: 973

Answers (3)

Eric Postpischil

Reputation: 222933

When a structure contains bit-fields, the C implementation uses some storage unit to hold them (or multiple storage units if needed). The storage unit might be one eight-bit byte or it might be four bytes, or it might be other sizes—this is a determination made by each C implementation. The C standard only requires that it be addressable, which effectively means it has to be a whole number of bytes.

Once we have a storage unit, it is some number of bits. Say it is 32 bits, and number the bits from 31 to 0, where, if we consider the bits to represent a binary numeral, bit 0 represents 2^0, and bit 31 represents 2^31. Note that Kernighan and Ritchie are imprecise to use “left” and “right” here. There is no inherent left or right. We usually write numerals with the most significant digits on the left, so we might consider bit 31 to be the leftmost and bit 0 to be the rightmost.

Now we have a storage unit with some number of bits and some labeling for those bits (31 to 0 or left to right). Say you want to put two bit-fields in them, say fields of width 7 and 5.

Which 7 of the bits from bit 31 to bit 0 are used for the first field? Which 5 of the bits are used for the second field?

We could use bits 31-25 for the first field and bits 24-20 for the second field. Or we could use bits 6-0 for the first field and bits 11-7 for the second field.

In theory, we could also use bits 27-21 for the first field and bits 15-11 for the second field. However, the C standard does say that “If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit” (C 2018 6.7.2.1 11). “Adjacent” is not formally defined, but we can assume it means consecutively numbered bits. So, if the C implementation puts the first field in bits 31-25, it is required to put the second field in bits 24-20. Conversely, if it puts the first field in bits 6-0, it must put the second field in bits 11-7.

Thus, the C standard requires an implementation to arrange successive bit-fields in a storage unit from left-to-right or from right-to-left, but it does not say which.

(I do not see anything in the standard that says the first field must start at one end of the storage unit or the other, rather than somewhere in the middle. That would lead to wasting some bits.)

Upvotes: 3

Joshua

Reputation: 43300

When you write:

struct {
    unsigned int version: 4;
    unsigned int length: 4;
    unsigned char dcsn;
};

you end up with a big headache you weren't expecting because your code is non-portable.

When you set version to 4 and length to 5, some systems may set the first byte of the structure to 0x45 and other systems may set the first byte of the structure to 0x54.

When I went to college this thing was #ifdef'd as follows (incorrect):

struct {
#if BIG_ENDIAN
    unsigned int version: 4;
    unsigned int length: 4;
#else
    unsigned int length: 4;
    unsigned int version: 4;
#endif
    unsigned char dcsn;
};

but this is still rolling the dice, as there is no rule that the order of the bits within the bytes of a bit-field corresponds to the order of bytes in a machine word. I would not be surprised if, when you cross-compile, the bit order in the struct came from the host machine's rules while the bit order of integers comes from the target machine's rules (as it must). In theory the code could be corrected by having a separate #ifdef for BIG_ENDIAN_BITFIELD, but I've never seen it done.

Upvotes: 3

Yunnosch

Reputation: 26703

Here is some demonstration code. The only goal is to demonstrate what you are asking about. Clean coding etc. is neglected.

#include <stdio.h>
#include <stdint.h>

union
{
    uint32_t Everything;
    struct 
    {
        uint32_t FirstMentionedBit : 1;
        uint32_t FewOTherBits      :30;
        uint32_t LastMentionedBit  : 1;
    } bitfield;
} Demonstration;

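int main(void)
{
    /* Set only the last-declared bit and show where it lands. */
    Demonstration.Everything               =0;
    Demonstration.bitfield.LastMentionedBit=1;
    
    printf("%x\n", (unsigned)Demonstration.Everything);

    /* Set only the first-declared bit and show where it lands. */
    Demonstration.Everything                =0;
    Demonstration.bitfield.FirstMentionedBit=1;
    
    printf("%x\n", (unsigned)Demonstration.Everything);

    return 0;
}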

If you run this at https://www.tutorialspoint.com/compile_c_online.php, the output is

80000000
1

But in other environments it might easily be

1
80000000

This is because compilers are free to consider the first mentioned bit the MSB or the LSB and correspondingly the last mentioned bit to be the LSB or MSB.
And that is what your quote describes.

Upvotes: 2
