M. Church

Reputation: 39

Union of structs containing only bit-fields: sizeof reports twice the expected bytes (C)

For some reason that I can't quite figure out, my union of structs containing only bit-fields occupies twice as many bytes as any single struct should need.

#include <stdio.h>
#include <stdlib.h>

union instructionSet {
    struct Brane{
        unsigned int opcode: 4;
        unsigned int address: 12;
    } brane;
    struct Cmp{
        unsigned int opcode: 4;
        unsigned int blank: 1;
        unsigned int rsvd: 3;
        unsigned char letter: 8;
    } cmp;
    struct {
        unsigned int rsvd: 16;
    } reserved;
};

int main(void) {

    union instructionSet IR;// = (union instructionSet*)calloc(1, 2);

    /* sizeof yields a size_t, which is printed with %zu, not %ld */
    printf("size of union %zu\n", sizeof(union instructionSet));
    printf("size of reserved %zu\n", sizeof(IR.reserved));
    printf("size of brane %zu\n", sizeof(IR.brane));
    printf("size of cmp %zu\n", sizeof(IR.cmp));

    return 0;
}

Every sizeof expression evaluates to 4; however, to my knowledge each should evaluate to 2.

Upvotes: 3

Views: 391

Answers (4)

Igor Galczak

Reputation: 142

Read about structure padding and memory alignment. By default, a 32-bit processor typically reads memory 32 bits (4 bytes) at a time because that is faster. So a char followed by a uint32 is laid out in 4 + 4 = 8 bytes (1 byte for the char, 3 bytes of padding, 4 bytes for the uint32).

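Put the first line below before your declarations and the second after them, and the result will be 2.

#pragma pack(1)

#pragma pack()

This tells the compiler to align members to 1 byte (the default is typically 4 on a 32-bit processor); the empty form restores the default. Note that #pragma pack is a compiler extension, not standard C.

PS: try this example with different #pragma pack settings: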

#include <stdio.h>

struct s1 
{
    char a;
    char b;
    int c;
};

struct s2
{    
    char b;
    int c;
    char a;
};

int main(void) {
    /* sizeof yields a size_t, which is printed with %zu */
    printf("size of s1 %zu\n", sizeof(struct s1));
    printf("size of s2 %zu\n", sizeof(struct s2));

    return 0;
}
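
To see where the padding actually goes, a minimal sketch along these lines may help. It assumes a compiler that accepts #pragma pack(push, 1) and #pragma pack(pop) (GCC, Clang and MSVC do); the names s2_default, s2_packed, padding_default, padding_packed and print_layouts are made up for this illustration and are not part of the answer above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same member order as s2 above, laid out with the default alignment. */
struct s2_default {
    char b;
    int c;
    char a;
};

/* Same members, packed to 1-byte alignment.
   #pragma pack is a compiler extension, not standard C. */
#pragma pack(push, 1)
struct s2_packed {
    char b;
    int c;
    char a;
};
#pragma pack(pop)

/* Padding bytes = total size minus the sum of the member sizes. */
static inline size_t padding_default(void) {
    return sizeof(struct s2_default) - (2 * sizeof(char) + sizeof(int));
}

static inline size_t padding_packed(void) {
    return sizeof(struct s2_packed) - (2 * sizeof(char) + sizeof(int));
}

/* Print where each layout places the int member and how much padding it has. */
static inline void print_layouts(void) {
    /* Assumption: with pack(1) the compiler inserts no padding at all. */
    assert(padding_packed() == 0);

    printf("default: size %zu, c at offset %zu, padding %zu\n",
           sizeof(struct s2_default), offsetof(struct s2_default, c),
           padding_default());
    printf("packed:  size %zu, c at offset %zu, padding %zu\n",
           sizeof(struct s2_packed), offsetof(struct s2_packed, c),
           padding_packed());
}
```

Calling print_layouts() from main shows both layouts side by side. Keep in mind that packing can make member access slower, or even unsafe through plain pointers on targets that require aligned access, so it is usually reserved for types that must match an external format.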

Upvotes: 1

Lundin

Reputation: 214525

It isn't specified what this code will do and it isn't meaningful to reason about it without a specific system and compiler in mind. Bit-fields are simply too poorly specified in the standard to be reliably used for things like memory layouts.

union instructionSet {

    /* any number of padding bits may be inserted here */ 

    /* we don't know if what will follow is MSB or LSB */

    struct Brane{
        unsigned int opcode: 4; 
        unsigned int address: 12;
    } brane;
    struct Cmp{
        unsigned int opcode: 4;
        unsigned int blank: 1;
        unsigned int rsvd: 3;
        /* anything can happen here, "letter" can merge with the previous 
           storage unit or get placed in a new storage unit */
        unsigned char letter: 8; // unsigned char does not need to be supported
    } cmp;
    struct {
        unsigned int rsvd: 16;
    } reserved;

    /* any number of padding bits may be inserted here */ 
};

The standard lets the compiler pick a "storage unit" for any bit-field type, which can be of any size. The standard simply states:

An implementation may allocate any addressable storage unit large enough to hold a bit-field.

Things we can't know:

  • How large the bitfields of type unsigned int are. 32 bits might make sense but no guarantee.
  • If unsigned char is allowed for bit-fields.
  • How large the bitfields of type unsigned char are. Could be any size from 8 to 32.
  • What will happen if the compiler picked a smaller storage unit than the expected 32 bits, and the bits don't fit inside it.
  • What happens if an unsigned int bit-field meets an unsigned char bit-field.
  • If there will be padding in the end of the union or in the beginning (alignment).
  • How individual storage units within the structs are aligned.
  • The location of the MSB.

Things we can know:

  • We have created some sort of binary blob in memory.
  • The first byte of the blob resides on the least significant address in memory. It may contain data or padding.

Further knowledge can be obtained by having a very specific system and compiler in mind.


Instead of the bit-fields we can use 100% portable and deterministic bitwise operations, that yield the same machine code anyway.
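
A minimal sketch of that approach, assuming a 16-bit instruction word with the opcode in the top 4 bits. The bit positions are an assumption made for this example (the question does not say where each field lives), and all macro and function names here are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumed layout (not given in the question):
   brane: [15..12] opcode, [11..0] address
   cmp:   [15..12] opcode, [11] blank, [10..8] reserved, [7..0] letter */
#define OPCODE_SHIFT 12u
#define OPCODE_MASK  0xFu
#define ADDRESS_MASK 0x0FFFu
#define BLANK_SHIFT  11u
#define LETTER_MASK  0xFFu

/* uint16_t, where it exists, has exactly 16 bits and no padding bits. */
static_assert(sizeof(uint16_t) * CHAR_BIT == 16, "instruction word must be 16 bits");

/* Build a "brane" instruction word from its two fields. */
static inline uint16_t make_brane(unsigned opcode, unsigned address) {
    return (uint16_t)(((opcode & OPCODE_MASK) << OPCODE_SHIFT) |
                      (address & ADDRESS_MASK));
}

/* Build a "cmp" instruction word; the reserved bits are left as zero. */
static inline uint16_t make_cmp(unsigned opcode, unsigned blank, unsigned letter) {
    return (uint16_t)(((opcode & OPCODE_MASK) << OPCODE_SHIFT) |
                      ((blank & 1u) << BLANK_SHIFT) |
                      (letter & LETTER_MASK));
}

/* Field extractors: shift the field down, then mask off everything else. */
static inline unsigned get_opcode(uint16_t word) {
    return (word >> OPCODE_SHIFT) & OPCODE_MASK;
}

static inline unsigned get_address(uint16_t word) {
    return word & ADDRESS_MASK;
}

static inline unsigned get_blank(uint16_t word) {
    return (word >> BLANK_SHIFT) & 1u;
}

static inline unsigned get_letter(uint16_t word) {
    return word & LETTER_MASK;
}
```

With this layout, make_brane(0xA, 0x123) produces 0xA123, and the getters recover 0xA and 0x123 from it. The word is exactly two bytes on any implementation that provides uint16_t, and the position of every bit is chosen by the code rather than by the compiler.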

Upvotes: 1

smsisko

Reputation: 54

There are a couple of problems here. First of all, your struct Brane declares its bit-fields as unsigned int, which is 4 bytes on your platform.

Even if you only use half of the bits, the compiler still reserves a full 32-bit unsigned int.

Second, your Cmp struct mixes two field types: the first three fields take 8 bits of a 32-bit unsigned int, and then letter is declared as unsigned char and takes its full 8 bits. How mixed types are laid out depends on the compiler's alignment rules; some compilers start a new storage unit when the type changes, so this struct can grow beyond 4 bytes (your compiler reports 4).

If you want the union to occupy only 16 bits, first use unsigned short, and then use the same type for every field so that everything shares one storage unit.

Something like this would fully optimize your union:

union instructionSet {
    struct Brane{
        unsigned short opcode: 4;
        unsigned short address: 12;
    } brane;
    struct Cmp{
        unsigned short opcode: 4;
        unsigned short blank: 1;
        unsigned short rsvd: 3;
        unsigned short letter: 8;
    } cmp;
    struct {
        unsigned short rsvd: 16;
    } reserved;
};

This would give you a size of 2 all around.
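
Since bit-field types other than int, unsigned int and _Bool are implementation-defined, it may be worth having the compiler confirm the size instead of assuming it. The following is a sketch assuming a C11 compiler that accepts unsigned short bit-fields (GCC, Clang and MSVC do); it repeats the union above and adds a compile-time check.

```c
#include <assert.h>   /* static_assert (C11) */

union instructionSet {
    struct Brane {
        unsigned short opcode: 4;
        unsigned short address: 12;
    } brane;
    struct Cmp {
        unsigned short opcode: 4;
        unsigned short blank: 1;
        unsigned short rsvd: 3;
        unsigned short letter: 8;
    } cmp;
    struct {
        unsigned short rsvd: 16;
    } reserved;
};

/* The build stops here if this compiler does not pack the union into 2 bytes. */
static_assert(sizeof(union instructionSet) == 2,
              "instructionSet is not 2 bytes on this compiler");
```

The check costs nothing at run time. It only guarantees the size, though; which bits each field occupies is still up to the compiler.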

Upvotes: 2

Eric Postpischil

Reputation: 223737

C 2018 6.7.2.1 11 allows the C implementation to choose the size of the container it uses for bit-fields:

An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined.…

The implementation you are using apparently chooses to use four-byte units. Likely that is also the size of an int in the implementation, suggesting that it is a convenient size for the implementation.

Upvotes: 2
