xcel
xcel

Reputation: 377

How to create a byte out of 8 bool values (and vice versa)?

I have 8 bool variables, and I want to "merge" them into a byte.

Is there an easy/preferred method to do this?

How about the other way around, decoding a byte into 8 separate boolean values?

I come in assuming it's not an unreasonable question, but since I couldn't find relevant documentation via Google, it's probably another one of those "nonono all your intuition is wrong" cases.

Upvotes: 30

Views: 33968

Answers (9)

RARE Kpop Manifesto
RARE Kpop Manifesto

Reputation: 2915

@Fabian @111111 -

If you want 8 split out bools with less hard-coding, maybe...

bool a,b,c,d,e,f,g,h;      # assuming you want [a] at MSB and 
                           #                   [h] at LSB

char y = a << 7 | b<< 6 | c << 5 | d << 4 | e << 3 | f << 2 | g << 1 | h
char y = h|(g|(f|(e|(d|(c|(b|a << 1)<< 1)<< 1)<< 1)<< 1)<< 1)<< 1

... instead of 7 different shifting amounts, a nested form allow re-using the same shifting constant over and over again,

with the added benefit of placing all the bools/digits together on one side, and all the shifting amounts on the other side

The arithmetic equivalent expressions would be (they're all identical) -

char y = h + (g + (f + (e + (d + (c + (b + a * 2) * 2) * 2) * 2) * 2) * 2) * 2

char y = 2 * (2 * (2 * (2 * (2 * (2 * (2 * a + b) + c) + d) + e) + f) + g) + h

char y = 2 * ( 2 * ( 2 * ( 2 * ( 2 * ( 2 * ( 2 * (\
            a ) + b ) + c ) + d ) + e ) + f ) + g ) + h

Personally I like the last form the most because it's a full-blown base-conversion routine without requiring the audience to even have any understanding what base and exponents are, or what shifting entails as long as they understand addition and multiplication.

Upvotes: 0

phuclv
phuclv

Reputation: 42012

The cool way (using the multiplication technique)

inline uint8_t pack8bools(bool* a)
{
    uint64_t t;
    memcpy(&t, a, sizeof t);         //  strict-aliasing & alignment safe load
    return 0x8040201008040201ULL*t >> 56;
       // bit order: a[0]<<7 | a[1]<<6 | ... | a[7]<<0  on little-endian
       // for a[0] => LSB, use 0x0102040810204080ULL    on little-endian
}

void unpack8bools(uint8_t b, bool* a)
{
       // on little-endian,  a[0] = (b>>7) & 1  like printing order
    auto MAGIC = 0x8040201008040201ULL;  // for opposite order, byte-reverse this
    auto MASK  = 0x8080808080808080ULL;
    uint64_t t = ((MAGIC*b) & MASK) >> 7;
    memcpy(a, &t, sizeof t);    // store 8 bytes without UB
}

Assuming sizeof(bool) == 1

To portably do LSB <-> a[0] (like the pext/pdep version below) instead of using the opposite of host endianness, use htole64(0x0102040810204080ULL) as the magic multiplier in both versions. (htole64 is from BSD / GNU <endian.h>). That arranges the multiplier bytes to match little-endian order for the bool array. htobe64 with the same constant gives the other order, MSB-first like you'd use for printing a number in base 2.

You may want to make sure that the bool array is 8-byte aligned (alignas(8)) for performance, and that the compiler knows this. memcpy is always safe for any alignment, but on ISAs that require alignment, a compiler can only inline memcpy as a single load or store instruction if it knows the pointer is sufficiently aligned. *(uint64_t*)a would promise alignment, but also violate the strict-aliasing rule. Even on ISAs that allow unaligned loads, they can be faster when naturally aligned. But the compiler can still inline memcpy without seeing that guarantee at compile time.


How they work

Suppose we have 8 bools b[0] to b[7] whose least significant bits are named a-h respectively that we want to pack into a single byte. Treating those 8 consecutive bools as one 64-bit word and load them we'll get the bits in reversed order in a little-endian machine. Now we'll do a multiplication (here dots are zero bits)

  |  b7  ||  b6  ||  b4  ||  b4  ||  b3  ||  b2  ||  b1  ||  b0  |
  .......h.......g.......f.......e.......d.......c.......b.......a
× 1000000001000000001000000001000000001000000001000000001000000001
  ────────────────────────────────────────────────────────────────
  ↑......h.↑.....g..↑....f...↑...e....↑..d.....↑.c......↑b.......a
  ↑.....g..↑....f...↑...e....↑..d.....↑.c......↑b.......a
  ↑....f...↑...e....↑..d.....↑.c......↑b.......a
+ ↑...e....↑..d.....↑.c......↑b.......a
  ↑..d.....↑.c......↑b.......a
  ↑.c......↑b.......a
  ↑b.......a
  a       
  ────────────────────────────────────────────────────────────────
= abcdefghxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The arrows are added so it's easier to see the position of the set bits in the magic number. At this point 8 least significant bits has been put in the top byte, we'll just need to mask the remaining bits out

So the magic number for packing would be 0b1000000001000000001000000001000000001000000001000000001000000001 or 0x8040201008040201. If you're on a big endian machine you'll need to use the magic number 0x0102040810204080 which is calculated in a similar manner

For unpacking we can do a similar multiplication

  |  b7  ||  b6  ||  b4  ||  b4  ||  b3  ||  b2  ||  b1  ||  b0  |
                                                          abcdefgh
× 1000000001000000001000000001000000001000000001000000001000000001
  ────────────────────────────────────────────────────────────────
= h0abcdefgh0abcdefgh0abcdefgh0abcdefgh0abcdefgh0abcdefgh0abcdefgh
& 1000000010000000100000001000000010000000100000001000000010000000
  ────────────────────────────────────────────────────────────────
= h0000000g0000000f0000000e0000000d0000000c0000000b0000000a0000000

After multiplying we have the needed bits at the most significant positions, so we need to mask out irrelevant bits and shift the remaining ones to the least significant positions. The output will be the bytes contain a to h in little endian.


The efficient way

On newer x86 CPUs with BMI2 there are PEXT and PDEP instructions for this purpose. The pack8bools function above can be replaced with

_pext_u64(*((uint64_t*)a), 0x0101010101010101ULL);

And the unpack8bools function can be implemented as

_pdep_u64(b, 0x0101010101010101ULL);

(This maps LSB -> LSB, like a 0x0102040810204080ULL multiplier constant, opposite of 0x8040201008040201ULL. x86 is little-endian: a[0] = (b>>0) & 1; after memcpy.)

Unfortunately those instructions are very slow on AMD before Zen 3 so you may need to compare with the multiplication method above to see which is better

The other fast way is SSE2

x86 SIMD has an operation that takes the high bit of every byte (or float or double) in a vector register, and gives it to you as an integer. The instruction for bytes is pmovmskb. This can of course do 16 bytes at a time with the same number of instructions, so it gets better than the multiply trick if you have lots of this to do.

#include <immintrin.h>

inline uint8_t pack8bools_SSE2(const bool* a)
{
    __m128i v = _mm_loadl_epi64( (const __m128i*)a );  // 8-byte load, despite the pointer type.
    // __m128 v = _mm_cvtsi64_si128( uint64 );  // alternative if you already have an 8-byte integer
    v = _mm_slli_epi32(v, 7);   // low bit of each byte becomes the highest
    return _mm_movemask_epi8(v);
}

There isn't a single instruction to unpack until AVX-512, which has mask-to-vector instructions. It is doable with SIMD, but likely not as efficiently as the multiply trick. See Convert 16 bits mask to 16 bytes mask and more generally is there an inverse instruction to the movemask instruction in intel avx2? for unpacking bitmaps to other element sizes.

How to efficiently convert an 8-bit bitmap to array of 0/1 integers with x86 SIMD has some answers specifically for 8-bits -> 8-bytes, but if you can't do 16 bits at a time for that direction, the multiply trick is probably better, and pext certainly is (except on CPUs where it's disastrously slow, like AMD before Zen 3).

Upvotes: 20

rodrigo
rodrigo

Reputation: 98526

The hard way:

unsigned char ToByte(bool b[8])
{
    unsigned char c = 0;
    for (int i=0; i < 8; ++i)
        if (b[i])
            c |= 1 << i;
    return c;
}

And:

void FromByte(unsigned char c, bool b[8])
{
    for (int i=0; i < 8; ++i)
        b[i] = (c & (1<<i)) != 0;
}

Or the cool way:

struct Bits
{
    unsigned b0:1, b1:1, b2:1, b3:1, b4:1, b5:1, b6:1, b7:1;
};
union CBits
{
    Bits bits;
    unsigned char byte;
};

Then you can assign to one member of the union and read from another. But note that the order of the bits in Bits is implementation defined.

Note that reading one union member after writing another is well-defined in ISO C99, and as an extension in several major C++ implementations (including MSVC and GNU-compatible C++ compilers), but is Undefined Behaviour in ISO C++. memcpy or C++20 std::bit_cast are the safe ways to type-pun in portable C++.

(Also, the bit-order of bitfields within a char is implementation defined, as is possible padding between bitfield members.)

Upvotes: 28

user406009
user406009

Reputation:

You might want to look into std::bitset. It allows you to compactly store booleans as bits, with all of the operators you would expect.

No point fooling around with bit-flipping and whatnot when you can abstract away.

Upvotes: 12

iBug
iBug

Reputation: 37317

I'd like to note that type punning through unions is UB in C++ (as rodrigo does in his answer. The safest way to do that is memcpy()

struct Bits
{
    unsigned b0:1, b1:1, b2:1, b3:1, b4:1, b5:1, b6:1, b7:1;
};

unsigned char toByte(Bits b){
    unsigned char ret;
    memcpy(&ret, &b, 1);
    return ret;
}

As others have said, the compiler is smart enough to optimize out memcpy().

BTW, this is the way that Boost does type punning.

Upvotes: 2

You would use the bitwise shift operation and casting to archive it. a function could work like this:

unsigned char toByte(bool *bools)
{
    unsigned char byte = \0;
    for(int i = 0; i < 8; ++i) byte |= ((unsigned char) bools[i]) << i;
    return byte;
}

Thanks Christian Rau for the correction s!

Upvotes: 0

ScarletAmaranth
ScarletAmaranth

Reputation: 5101

There is no way to pack 8 bool variables into one byte. There is a way packing 8 logical true/false states in a single byte using Bitmasking.

Upvotes: 1

111111
111111

Reputation: 16168

bool a,b,c,d,e,f,g,h;
//do stuff
char y= a<<7 | b<<6 | c<<5 | d<<4 | e <<3 | f<<2 | g<<1 | h;//merge

although you are probably better off using a bitset

http://www.cplusplus.com/reference/stl/bitset/bitset/

Upvotes: 2

Jeremy Friesner
Jeremy Friesner

Reputation: 73379

#include <stdint.h>   // to get the uint8_t type

uint8_t GetByteFromBools(const bool eightBools[8])
{
   uint8_t ret = 0;
   for (int i=0; i<8; i++) if (eightBools[i] == true) ret |= (1<<i);
   return ret;
}

void DecodeByteIntoEightBools(uint8_t theByte, bool eightBools[8])
{
   for (int i=0; i<8; i++) eightBools[i] = ((theByte & (1<<i)) != 0);
}

Upvotes: 6

Related Questions