abdolahS

Reputation: 683

Best way to convert 8 booleans to one byte?

I want to pack 8 booleans into one byte and then save it to a file (this must be done for a very large amount of data). I've used the following code, but I'm not sure it is the best approach in terms of speed and space:

int bits[]={1,0,0,0,0,1,1,1};
char a='\0';
for (int i=0;i<8;i++){
  a=a<<1;
  a+=bits[i];
}
//and then save "a"

can anyone give me a better code(more speed) ?

Upvotes: 1

Views: 1318

Answers (4)

jbilander

Reputation: 651

Late reply but I had to do this myself and here is how I did it, no loop required.

int bits[] = {1, 0, 0, 0, 0, 1, 1, 1};

uint8_t result = bits[0] << 7 | bits[1] << 6 | bits[2] << 5 | bits[3] << 4 | 
                 bits[4] << 3 | bits[5] << 2 | bits[6] << 1 | bits[7];

You can verify the result is correct by doing a printf output of the result in binary format like this:

char buffer[9];           // 8 binary digits plus the terminating '\0'
itoa(result, buffer, 2);  // note: itoa is a non-standard extension
printf("binary: %s\n", buffer);

Will print out:

binary: 10000111
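Since itoa is not part of standard C and drops leading zeros, a portable check could look like the sketch below. The helper name byte_to_binary is hypothetical (it is not from the answer); it always produces eight digits, most significant bit first.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes the 8 binary digits of value into out,
   most significant bit first, followed by a terminating '\0'.
   out must have room for at least 9 chars. */
void byte_to_binary(uint8_t value, char out[9])
{
    for (int i = 0; i < 8; ++i) {
        out[i] = (value & (0x80u >> i)) ? '1' : '0';
    }
    out[8] = '\0';
}

/* Prints the byte in the same "binary: ..." form used above. */
void print_binary(uint8_t value)
{
    char buffer[9];
    byte_to_binary(value, buffer);
    printf("binary: %s\n", buffer);
}
```

Calling print_binary(result) with the result computed above should give the same line as the itoa version, with the difference that small values keep their leading zeros.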

Upvotes: 0

user555045

Reputation: 64904

If you don't mind using SSE intrinsics, then _mm_movemask_epi8 is an excellent fit. It uses 16 bytes, but you can just set the others to zero.

For example (not tested)

__m128i values = _mm_loadl_epi64((__m128i*)array);
__m128i order = _mm_set_epi8(0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
                             0, 1, 2, 3, 4, 5, 6, 7);
values = _mm_shuffle_epi8(values, order);
int result = _mm_movemask_epi8(_mm_slli_epi32(values, 7));

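This assumes the array is an array of chars holding 0 or 1, with at least 8 readable elements. Note that _mm_shuffle_epi8 requires SSSE3, not just SSE2. If you can't use a char array, it takes some more loads and packs and it becomes a bit annoying.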

Upvotes: 4

Cheers and hth. - Alf

Reputation: 145279

Regarding

can anyone give me a better code(more speed)

you should measure. Most of the impact on the speed of serializing to file is i/o speed. What you do with the bits will likely have an unmeasurably small impact, but if it has any impact then that is likely mostly influenced by your original representation of the sequence of booleans.
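To illustrate the point about I/O, here is a minimal sketch that packs a whole run of booleans into a buffer and hands it to the file in one fwrite call, instead of writing byte by byte. The names pack_bits and write_packed are hypothetical, and the sketch assumes the booleans are stored one per unsigned char; a trailing partial byte is padded with zero bits.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: packs n boolean flags (one per unsigned char, any
   nonzero value counts as true) into out, most significant bit first.
   out must have room for (n + 7) / 8 bytes. Returns the bytes written. */
size_t pack_bits(const unsigned char *flags, size_t n, unsigned char *out)
{
    size_t nbytes = (n + 7) / 8;
    for (size_t i = 0; i < nbytes; ++i) {
        out[i] = 0;
    }
    for (size_t i = 0; i < n; ++i) {
        if (flags[i]) {
            out[i / 8] |= (unsigned char)(0x80u >> (i % 8));
        }
    }
    return nbytes;
}

/* Hypothetical helper: writes the packed bytes with a single fwrite call.
   Returns 0 on success, -1 on a short write. */
int write_packed(FILE *fp, const unsigned char *packed, size_t nbytes)
{
    return fwrite(packed, 1, nbytes, fp) == nbytes ? 0 : -1;
}
```

Whether this is actually faster than the alternatives is something only a measurement on the real data and storage can tell.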


Now regarding the given code

int bits[]={1,0,0,0,0,1,1,1};
char a='\0';
for (int i=0;i<8;i++){
  a=a<<1;
  a+=bits[i];
}
//and then save "a"

  • Use unsigned char as byte type, just on principle.
  • Use bitlevel OR, the | operator, again just on principle.
  • Use prefix ++, yes, also that just on principle.

The “on principle” for the first point is because in practice your code will not run on any machine with sign-and-magnitude or one's complement representation of signed integers, where char is signed. But I think it's generally a good idea to express in the code exactly what one intends doing, instead of rewriting it as something slightly different. And the intention here is to deal with bits, an unsigned byte.

The “on principle” for the bitlevel OR is because for this particular case there's no practical difference between bitlevel OR and addition. But in general it's a good idea to write in code what one means to express. And then it's no good to write a bitlevel OR as an addition: it might even trip you up, bite you in the a**, in some other context.

The “on principle” for the prefix ++ is because in practice the compiler will optimize prefix and postfix ++ for a basic type, when the expression result isn't used, to the very same machine code. But again it's generally better to write what one intends to express. Asking for an original value (the postfix ++) is just misleading a reader of the code when you're not ever using that original value – and as with the bitlevel OR expressed as addition, the pure increment expressed as postfix ++ might trip you up, bite you in the a**, in some other context, e.g. with iterators.
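Put together, the three points could look like the following sketch. The function name pack8 is hypothetical, and masking each input with & 1 is an added assumption that only the lowest bit of each element is meaningful.

```c
/* Packs eight 0/1 values into one byte; bits[0] becomes the most
   significant bit, as in the question's loop. */
unsigned char pack8(const int bits[8])
{
    unsigned char a = 0;                 /* unsigned byte type */
    for (int i = 0; i < 8; ++i) {        /* prefix increment */
        /* Bitwise OR instead of addition; & 1 keeps only the lowest bit
           of each input (an assumption, not part of the original code). */
        a = (unsigned char)((a << 1) | (bits[i] & 1));
    }
    return a;
}
```

For the array from the question, {1,0,0,0,0,1,1,1}, this returns 0x87, which is 10000111 in binary.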


The general approach of explicitly coding up shifting and ORing appears to me to be fine because std::bitset does not support initialization from a sequence of booleans (only from an integer value or a text string), so it doesn't save you any work. But generally it's a good idea to check the standard library, whether it supports whatever one wants to do. It might even happen that someone else chimes in here with some standard library based approach that I didn't think of! ;-)

Upvotes: 2

Marcus Müller

Reputation: 36346

Replace the += operator by |=, which is the bit-wise operation (and actually what you want to do here). Use unsigned char for your truth values, if possible.

Unless you want to hand-unroll your loops and/or use SIMD intrinsics, that would be the most compiler-optimizable solution, I guess.

There's another trick: structs can have bit-fields, and you can put such a struct in a union to access the same bits as an integer.
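A minimal sketch of that bit-field trick follows. The type and function names (flags8, pack8_bitfield) are hypothetical. Two caveats apply: the order in which bit-fields are laid out is implementation-defined, so which flag ends up in which bit of the byte depends on the compiler and target, and the sketch assumes the eight fields share the first byte of the struct, which holds on common ABIs but is not guaranteed. Reading a union member other than the one last written is allowed in C, but not in standard C++.

```c
/* Eight one-bit flags overlaid with a byte. Bit order inside the byte is
   implementation-defined (assumption: all eight fields share the first
   byte of the struct). */
union flags8 {
    struct {
        unsigned int b0 : 1;
        unsigned int b1 : 1;
        unsigned int b2 : 1;
        unsigned int b3 : 1;
        unsigned int b4 : 1;
        unsigned int b5 : 1;
        unsigned int b6 : 1;
        unsigned int b7 : 1;
    } bit;
    unsigned char byte;
};

/* Hypothetical helper: stores each input as one flag and reads the
   overlaid byte back. Any nonzero input counts as true. */
unsigned char pack8_bitfield(const int bits[8])
{
    union flags8 u = {{0}};
    u.bit.b0 = bits[0] != 0;
    u.bit.b1 = bits[1] != 0;
    u.bit.b2 = bits[2] != 0;
    u.bit.b3 = bits[3] != 0;
    u.bit.b4 = bits[4] != 0;
    u.bit.b5 = bits[5] != 0;
    u.bit.b6 = bits[6] != 0;
    u.bit.b7 = bits[7] != 0;
    return u.byte;
}
```

Because the resulting bit order is not portable, explicit shifting and ORing is probably the safer choice for a file format that other compilers or machines must read.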

By the way: your code is buggy. You shift first, then write; you use addition, but a signed char, which will definitely go wrong for the 7th and 8th bits (given you erroneously shift too early; if you did that properly, only the 8th bit will cause hazard).

Upvotes: -1
