Anton

Reputation: 93

Clarification - Struct Bitfield memory layout

Consider the below structure:

    typedef struct football_game {
        unsigned short      num_players  : 4;
        unsigned int        num_managers : 1;
        unsigned short      num_cameras  : 8;
        long int            num_vip      : 16;
        unsigned int        num_screens  : 4;
    } statistics;

    // The following values are assigned
    statistics week_a;
    week_a.num_players = 0xf, week_a.num_managers = 1;
    week_a.num_cameras = 0xff, week_a.num_vip = 0xfff;
    week_a.num_screens = 0xf;

ABI info (please note that my system is little-endian):

file a.out
a.out: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=xxx for GNU/Linux 3.2.0, with debug_info, not stripped

I have gone through a few articles. Based on my current understanding, I have drawn the layout below.

Case 1: without applying __attribute__((packed)) to the struct

How I think the memory is laid out:

# Arranged in Little endian, read the bits from right to left
0x7fffffffdf88 0xff = num_cameras(3 bits) + num_managers(1 bit) + num_players(4 bits)
0x7fffffffdf89 0xff = num_vip(3 bits) + num_cameras(5 bits)
0x7fffffffdf8a 0xff = num_vip(8 bits)
0x7fffffffdf8b 0xe1 = num_screens(3 bits) + num_vip(5 bits (of which 4 will be 0))
0x7fffffffdf8c 0x01 = (7 bits are 0) + num_screens(1 bit)
0x7fffffffdf8d 0x00 = 8 bits padding
0x7fffffffdf8e 0x00 = 8 bits padding
0x7fffffffdf8f 0x00 = 8 bits padding

How GDB gives the output:

0x7fffffffdf88: 0xff    0xff    0xff    0x01    0x0f    0x00    0x00    0x00
# Below for clarity
0x7fffffffdf88 0xff = num_cameras(3 bits) + num_managers(1 bit) + num_players(4 bits)
0x7fffffffdf89 0xff = num_vip(3 bits) + num_cameras(5 bits)
0x7fffffffdf8a 0xff = num_vip(8 bits)
0x7fffffffdf8b 0x01 = (3 bits unused) + num_vip(5 bits (of which 4 are 0))
0x7fffffffdf8c 0x0f = (4 bits are 0) + num_screens(4 bits)
0x7fffffffdf8d 0x00 = 8 bits padding
0x7fffffffdf8e 0x00 = 8 bits padding
0x7fffffffdf8f 0x00 = 8 bits padding

Case 2: Same struct if __attribute__((packed)) is used

The output of GDB

sizeof week_a : 5
# This is expected since we are using packed.
(gdb) x/5bx &week_a
0x7fffffffdf8b: 0xff    0xff    0xff    0xe1    0x01

I'm a beginner and need some help understanding how the bits are packed here. Is my understanding correct? I don't think the bits need packing, but I'm not sure.

Question

Why did the compiler allocate a separate byte for num_screens at 0x7fffffffdf8c in Case 1? I assumed that, since the size of the struct is 8 bytes (long int is 8 bytes), it could fit all the bit-fields contiguously in those 64 bits.

It would also be helpful if you could point me to the proper rules, either ABI-specific or for the ideal case, because as far as I understand, how the bits are allocated depends on the compiler and the machine.


Thanks.

Upvotes: 3

Views: 126

Answers (1)

Nate Eldredge

Reputation: 58653

First, note that within ISO C, the only declared types that are defined for bit-fields are bool and (signed/unsigned) int. The use of short, long, etc. is implementation-defined.

Your system follows the x86-64 SysV ABI, which specifies the layout of bit-fields in section 3.1.2, "Data Representation". The relevant rules are:

• bit-fields are allocated from right to left

• bit-fields must be contained in a storage unit appropriate for its declared type

Here, "right to left" means from least to most significant bits.

The second rule means, for instance, that when a bit-field is declared as short (which is a 16-bit type on this platform), then that bit-field will fit inside a single aligned 16-bit storage unit (word) within the struct. "Aligned" means that the address of the word must be a multiple of its size (2 bytes). In particular, this forces both the required alignment of the struct and the offset of the word within the struct to be multiples of 2.

Thus in your example, unsigned int num_screens : 4 must fit inside an aligned 32-bit unit (dword). Your proposed layout would not satisfy this: you would have it spanning between the dword at address 0x7fffffffdf88 (offset 0) and the dword at address 0x7fffffffdf8c (offset 4). You could say it fits inside the dword at offset 3, but that doesn't count as it's not properly aligned. So what the compiler does instead is put it at the least significant bits of the dword at offset 4, leaving the top 3 bits of the previous byte (bits 29 through 31 of the struct) unused.
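If you want to check this without GDB, here is a minimal sketch that copies the raw bytes of the struct into a buffer. The helper name dump_week_a is made up for illustration, and the result assumes GCC or Clang on x86-64 Linux (short and long bit-fields are themselves an extension). The struct is zeroed first so that the unused bits and the trailing padding read as 0 in practice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same declaration as in the question (not packed). */
typedef struct football_game {
    unsigned short      num_players  : 4;
    unsigned int        num_managers : 1;
    unsigned short      num_cameras  : 8;
    long int            num_vip      : 16;
    unsigned int        num_screens  : 4;
} statistics;

/* Hypothetical helper: assigns the values from the question to a zeroed
   struct and copies its object representation into out.
   Returns the number of bytes copied, i.e. sizeof(statistics). */
size_t dump_week_a(unsigned char *out, size_t out_len)
{
    statistics week_a;

    assert(out_len >= sizeof week_a);
    memset(&week_a, 0, sizeof week_a);  /* unused bits and padding start as 0 */

    week_a.num_players  = 0xf;
    week_a.num_managers = 1;
    week_a.num_cameras  = 0xff;
    week_a.num_vip      = 0xfff;
    week_a.num_screens  = 0xf;

    memcpy(out, &week_a, sizeof week_a);
    return sizeof week_a;
}
```

Printing the returned bytes in order from a small main should reproduce the GDB dump from Case 1, with 0x01 at offset 3 and 0x0f at offset 4.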

Another way to think about this rule is to think of the bit-fields as all living within a single array of bits, numbered increasing from 0. So bit 0 is the least-significant bit of the byte at offset 0, etc. (This fits nicely with x86-64 being little-endian; bit 26 can be thought of as bit 26 of a 32-bit unit at offset 0, or as bit 10 of a 16-bit unit at offset 2, or as bit 2 of a byte at offset 3.)
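The arithmetic behind that parenthetical can be sketched as follows. The type bit_location and the function locate_bit are names invented for this illustration; they just turn a bit index in the array into the byte offset of the aligned unit that contains it, plus the bit position inside that unit, assuming a little-endian machine.

```c
#include <assert.h>

/* Where one bit of the struct's bit array lands when viewed through
   aligned storage units of a given size. */
struct bit_location {
    unsigned byte_offset;  /* offset of the aligned unit within the struct */
    unsigned bit_in_unit;  /* 0 = least significant bit of that unit */
};

/* unit_bits is the storage unit size in bits: 8, 16, 32 or 64. */
struct bit_location locate_bit(unsigned bit, unsigned unit_bits)
{
    struct bit_location loc;

    assert(unit_bits == 8 || unit_bits == 16 ||
           unit_bits == 32 || unit_bits == 64);

    loc.byte_offset = (bit / unit_bits) * (unit_bits / 8);
    loc.bit_in_unit = bit % unit_bits;
    return loc;
}
```

For bit 26 this gives offset 0 and bit 26 for a 32-bit unit, offset 2 and bit 10 for a 16-bit unit, and offset 3 and bit 2 for a byte.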

From this point of view, the second rule above says that a bit-field declared as short must correspond to bits that do not cross through a multiple of 16, and so on.
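As a rough model of that rule (not the compiler's actual implementation), the function below takes the next free bit, the width of the field and the size of its declared type in bits, and returns where the field would start. The name place_bitfield is hypothetical, and the model deliberately ignores zero-width bit-fields, ordinary members between bit-fields, and packed structs.

```c
#include <assert.h>

/* Simplified model: returns the first bit index >= cursor at which a
   bit-field of `width` bits can start without crossing a multiple of
   unit_bits (the size in bits of the field's declared type). */
unsigned place_bitfield(unsigned cursor, unsigned width, unsigned unit_bits)
{
    assert(width >= 1 && width <= unit_bits);

    unsigned first_unit = cursor / unit_bits;                /* unit holding the first bit */
    unsigned last_unit  = (cursor + width - 1) / unit_bits;  /* unit holding the last bit */

    if (first_unit == last_unit)
        return cursor;                    /* no boundary crossed: start right here */

    return (first_unit + 1) * unit_bits;  /* otherwise skip to the next aligned unit */
}
```

With this model, a 4-bit unsigned int field whose next free bit is 29 gets moved to bit 32, while a 16-bit long int field at bit 13 stays where it is.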

So let's apply this analysis to your struct.

  • unsigned short num_players : 4 can occupy bits 0 through 3. This does not cross any multiple of 16 so that's fine.
  • unsigned int num_managers : 1 occupies bit 4. The rule is that it should not cross a multiple of 32, which is trivially satisfied.
  • unsigned short num_cameras : 8 occupies bits 5 through 12. Again, no multiple of 16 is crossed.
  • long int num_vip : 16 occupies bits 13 through 28. Since long int is 64 bits, the only requirement here is that a multiple of 64 should not be crossed, and that's satisfied.
  • unsigned int num_screens : 4 cannot occupy bits 29 through 32 as you propose, because it would cross a 32-bit boundary. (You can think of the boundary as being between bits 31 and 32.) Instead, it must occupy bits 32 through 35.
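You can also ask the compiler directly which bits each field occupies. The sketch below sets one field at a time to all ones in an otherwise zeroed struct and reads the struct back as a 64-bit integer, so that bit i of the result is bit i of the bit array. The names raw_bits, PROBE and field_masks are invented for this example, and it assumes a little-endian target where sizeof(statistics) is 8, as on x86-64 Linux with GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same declaration as in the question (not packed). */
typedef struct football_game {
    unsigned short      num_players  : 4;
    unsigned int        num_managers : 1;
    unsigned short      num_cameras  : 8;
    long int            num_vip      : 16;
    unsigned int        num_screens  : 4;
} statistics;

/* Reinterprets the struct's bytes as one 64-bit little-endian integer. */
static uint64_t raw_bits(const statistics *s)
{
    uint64_t v = 0;

    assert(sizeof *s == sizeof v);  /* holds on the platform in the question */
    memcpy(&v, s, sizeof v);
    return v;
}

/* Zero a struct, set a single field to all ones, record which bits changed. */
#define PROBE(i, field, ones)            \
    do {                                 \
        statistics s;                    \
        memset(&s, 0, sizeof s);         \
        s.field = (ones);                \
        out[i] = raw_bits(&s);           \
    } while (0)

/* Fills out[0..4] with one bit mask per field, in declaration order. */
void field_masks(uint64_t out[5])
{
    PROBE(0, num_players,  0xf);
    PROBE(1, num_managers, 1);
    PROBE(2, num_cameras,  0xff);
    PROBE(3, num_vip,      -1);   /* -1 sets all 16 bits of the signed field */
    PROBE(4, num_screens,  0xf);
}
```

On a setup like the one in the question, the masks should come out as 0xf, 0x10, 0x1fe0, 0x1fffe000 and 0xf00000000, which matches the bit ranges listed above, including the gap at bits 29 through 31.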

Upvotes: 2
