Portable and Tight Bit Packing

Question

Suppose I have four unsigned ints, {a, b, c, d}, which I want to pack with non-standard lengths, {9, 5, 7, 11} bits respectively (32 bits in total). I wish to build a network packet (unsigned char pkt[4]) that I can pack these values into and unpack reliably on another machine using the same header file, regardless of endianness.

Everything I have read about using packed structs suggests that the bit-ordering will not be predictable so that is out of the question. So that leaves me with bit-set and bit-clear operations, but I'm not confident in how to ensure that endianness will not cause me problems. Is the following sufficient, or shall I run into problems with the endianness of a and d separately?

void pack_pkt(uint16_t a, uint8_t b, uint8_t c, uint16_t d, uint8_t *pkt){
    uint32_t pkt_h = ((uint32_t)a & 0x1FF)      // 9 bits
                 | (((uint32_t)b & 0x1F) << 9)  // 5 bits
                 | (((uint32_t)c & 0x7F) << 14) // 7 bits
                 | (((uint32_t)d & 0x7FF) << 21); //11 bits
    *pkt = htonl(pkt_h);
}

void unpack_pkt(uint16_t *a, uint8_t *b, uint8_t *c, uint16_t *d, uint8_t *pkt){
    uint32_t pkt_h = ntohl(*pkt);
    (*a) = pkt_h & 0x1FF;
    (*b) = (pkt_h >> 9) & 0x1F;
    (*c) = (pkt_h >> 14) & 0x7F;
    (*d) = (pkt_h >> 21) & 0x7FF;
}

If so, what other measures can I take to ensure portability?

user555045 · Accepted Answer

Structs with bitfields are indeed essentially useless for this purpose, as their field order and even their padding rules are implementation-defined and not consistent across compilers and platforms.

"shall I run into problems with the endianness of a and d separately?"

The endianness of a and d doesn't matter, because their byte order is never used. Neither is reinterpreted as raw bytes; only their integer values are read or assigned, and in those cases endianness does not enter the picture.
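To make the distinction concrete, here is a minimal sketch contrasting a value-based operation with a reinterpretation of an object's bytes. The helper names low_byte_by_value and first_byte_in_memory are made up for this illustration and are not part of the question's code.

```c
#include <stdint.h>
#include <string.h>

/* Value-based: masks the integer value, so the result is the least
   significant 8 bits on every host, whatever its byte order. */
static uint8_t low_byte_by_value(uint16_t x) {
    return (uint8_t)(x & 0xFF);
}

/* Representation-based: copies the object's bytes and returns the one at
   the lowest address. Which byte that is depends on host byte order. */
static uint8_t first_byte_in_memory(uint16_t x) {
    uint8_t bytes[sizeof x];
    memcpy(bytes, &x, sizeof x);
    return bytes[0];
}
```

For x = 0x1234, low_byte_by_value returns 0x34 everywhere, while first_byte_in_memory returns 0x34 on a little-endian host and 0x12 on a big-endian one. The shifts and masks applied to a and d in pack_pkt are all of the first kind.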

There is another problem though: uint8_t *pkt in combination with *pkt = htonl(pkt_h); means that only the least significant byte of the htonl result is saved. This happens regardless of whether the code runs on a little-endian or big-endian machine, because the assignment is an implicit conversion, not a reinterpretation. uint8_t *pkt is fine by itself, but then the resulting group of 4 bytes must be copied into the buffer it points to; it cannot be assigned in one go. uint32_t *pkt would let such a single assignment work without losing data, but that makes the function less convenient to use.

Similarly, unpack_pkt currently uses only one byte of data: ntohl(*pkt) converts the single byte *pkt, not the four bytes of the packet.
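The truncation can be reproduced in isolation. This is a small sketch with hypothetical helper names (store_through_byte_pointer, load_through_byte_pointer) that mirror the shape of the two problem lines; htonl and ntohl are left out so that the result does not depend on the host.

```c
#include <stdint.h>

/* Same shape as `*pkt = htonl(pkt_h);`: a 32-bit value is assigned to a
   single uint8_t, so it is reduced modulo 256 and only one byte is written.
   The cast just makes the otherwise implicit conversion visible. */
static void store_through_byte_pointer(uint8_t *pkt, uint32_t word) {
    *pkt = (uint8_t)word;
}

/* Same shape as `ntohl(*pkt)`: only pkt[0] is read and widened to 32 bits. */
static uint32_t load_through_byte_pointer(const uint8_t *pkt) {
    return *pkt;
}
```

Storing 0x11223344 into a 4-byte buffer this way changes only the first byte (to 0x44); the other three bytes keep whatever they held before.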

When those issues are fixed, it should be good:

#include <arpa/inet.h> // htonl, ntohl (POSIX)
#include <stdint.h>
#include <string.h>    // memcpy

void pack_pkt(uint16_t a, uint8_t b, uint8_t c, uint16_t d, uint8_t *buffer){
    uint32_t pkt_h = ((uint32_t)a & 0x1FF)      // 9 bits
                 | (((uint32_t)b & 0x1F) << 9)  // 5 bits
                 | (((uint32_t)c & 0x7F) << 14) // 7 bits
                 | (((uint32_t)d & 0x7FF) << 21); //11 bits
    uint32_t pkt = htonl(pkt_h);            // host order -> network (big endian) order
    memcpy(buffer, &pkt, sizeof(uint32_t)); // copy all 4 bytes into the buffer
}

void unpack_pkt(uint16_t *a, uint8_t *b, uint8_t *c, uint16_t *d, const uint8_t *buffer){
    uint32_t pkt;
    memcpy(&pkt, buffer, sizeof(uint32_t)); // read all 4 bytes from the buffer
    uint32_t pkt_h = ntohl(pkt);            // network order -> host order
    (*a) = pkt_h & 0x1FF;
    (*b) = (pkt_h >> 9) & 0x1F;
    (*c) = (pkt_h >> 14) & 0x7F;
    (*d) = (pkt_h >> 21) & 0x7FF;
}
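The htonl plus memcpy step can be factored out and checked on its own. The sketch below assumes a POSIX system (for <arpa/inet.h>); store_be32 and load_be32 are hypothetical helper names, not part of the code above.

```c
#include <arpa/inet.h> /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>    /* memcpy */

/* Convert a host-order value to network order, then copy its 4 bytes out. */
static void store_be32(uint8_t *buffer, uint32_t host_value) {
    uint32_t net = htonl(host_value);
    memcpy(buffer, &net, sizeof net);
}

/* Copy 4 bytes in, then convert from network order back to host order. */
static uint32_t load_be32(const uint8_t *buffer) {
    uint32_t net;
    memcpy(&net, buffer, sizeof net);
    return ntohl(net);
}
```

After store_be32(buf, 0x11223344) the buffer holds 11 22 33 44 on any host, since htonl converts to network (big-endian) byte order before the bytes are copied, and load_be32(buf) gives back 0x11223344.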

An alternative that works without worrying about endianness at any point is manually deconstructing the uint32_t (rather than conditionally byte-swapping it with htonl and then reinterpreting it as raw bytes), for example:

void pack_pkt(uint16_t a, uint8_t b, uint8_t c, uint16_t d, uint8_t *pkt){
    uint32_t pkt_h = ((uint32_t)a & 0x1FF)      // 9 bits
                 | (((uint32_t)b & 0x1F) << 9)  // 5 bits
                 | (((uint32_t)c & 0x7F) << 14) // 7 bits
                 | (((uint32_t)d & 0x7FF) << 21); //11 bits
    // example serializing the bytes in big endian order, regardless of host endianness
    pkt[0] = (uint8_t)(pkt_h >> 24); // most significant byte first
    pkt[1] = (uint8_t)(pkt_h >> 16);
    pkt[2] = (uint8_t)(pkt_h >> 8);
    pkt[3] = (uint8_t)pkt_h;         // least significant byte last
}

The original approach isn't bad; this is just an alternative to consider. Since nothing is ever reinterpreted, endianness does not matter at all, which may increase confidence in the correctness of the code. The downside is that it takes more code to get the same thing done. Although manually deconstructing the uint32_t and storing 4 separate bytes looks like a lot of work, GCC can compile it efficiently into a bswap and a 32-bit store. Clang, on the other hand, misses this opportunity, and other compilers may as well, so this approach is not without its drawbacks.
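For completeness, the matching unpack for the byte-by-byte version might look like the sketch below. It assumes the same layout as pack_pkt (a in bits 0-8, b in bits 9-13, c in bits 14-20, d in bits 21-31, serialized big endian) and a 7-bit mask (0x7F) for c; unpack_pkt_bytes is a name chosen for this example.

```c
#include <stdint.h>

/* Rebuild the 32-bit word from four big-endian bytes, then split it into
   its fields. No byte-order conversion function is needed. */
static void unpack_pkt_bytes(uint16_t *a, uint8_t *b, uint8_t *c, uint16_t *d,
                             const uint8_t *pkt) {
    /* Cast before shifting: pkt[i] would otherwise be promoted to int, and
       shifting a value >= 0x80 left by 24 could overflow a 32-bit int. */
    uint32_t pkt_h = ((uint32_t)pkt[0] << 24)
                   | ((uint32_t)pkt[1] << 16)
                   | ((uint32_t)pkt[2] << 8)
                   |  (uint32_t)pkt[3];
    *a = (uint16_t)(pkt_h & 0x1FF);          /*  9 bits */
    *b = (uint8_t)((pkt_h >> 9) & 0x1F);     /*  5 bits */
    *c = (uint8_t)((pkt_h >> 14) & 0x7F);    /*  7 bits (assumed mask) */
    *d = (uint16_t)((pkt_h >> 21) & 0x7FF);  /* 11 bits */
}
```

For example, the bytes 00 80 C4 01 decode to a = 1, b = 2, c = 3, d = 4.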
