LiamLony

Reputation: 51

Counting the number of one bits in a byte with O(1) time complexity (C++ code)

I've been searching for an algorithm that counts the number of ones in a byte in O(1) time complexity, and this is what I found on Google:

// C++ implementation of the approach  
#include <bits/stdc++.h> 
using namespace std; 
  
int BitsSetTable256[256]; 
  
// Function to initialise the lookup table  
void initialize()  
{  
  
    // To initially generate the  
    // table algorithmically  
    BitsSetTable256[0] = 0;  
    for (int i = 0; i < 256; i++) 
    {  
        BitsSetTable256[i] = (i & 1) +  
        BitsSetTable256[i / 2];  
    }  
}  
  
// Function to return the count  
// of set bits in n  
int countSetBits(int n)  
{  
    return (BitsSetTable256[n & 0xff] +  
            BitsSetTable256[(n >> 8) & 0xff] +  
            BitsSetTable256[(n >> 16) & 0xff] +  
            BitsSetTable256[n >> 24]);  
}  
  
// Driver code  
int main()  
{  
    // Initialise the lookup table  
    initialize();  
    int n = 9;  
    cout << countSetBits(n); 
}  
  

I understand why I need an array of size 256 (in other words, the size of the lookup table) for indices from 0 to 255, which are all the decimal values a byte can represent!

But in the function initialize I don't understand the expression inside the for loop: BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
Why am I doing that? I don't understand the purpose of this line of code inside the for loop.

In addition, the function countSetBits returns:

 return (BitsSetTable256[n & 0xff] +  
            BitsSetTable256[(n >> 8) & 0xff] +  
            BitsSetTable256[(n >> 16) & 0xff] +  
            BitsSetTable256[n >> 24]); 

I don't understand at all what the bitwise AND with 0xff does and why the right shifts are needed. Could anyone please explain the concept to me? I also don't understand why, in countSetBits, the entry BitsSetTable256[n >> 24] is not ANDed with 0xff.

I understand why I need a lookup table of size 2^8, but I don't understand the other lines of code I mentioned above. Could anyone please explain them to me in simple words? And what is the purpose of counting the number of ones in a byte?

Thanks a lot, guys!

Upvotes: 5

Views: 741

Answers (3)

Scheff's Cat

Reputation: 20141

Concerning the first part of question:

// Function to initialise the lookup table  
void initialize()  
{  
  
    // To initially generate the  
    // table algorithmically  
    BitsSetTable256[0] = 0;  
    for (int i = 0; i < 256; i++) 
    {  
        BitsSetTable256[i] = (i & 1) +  
        BitsSetTable256[i / 2];  
    }  
}  

This is a neat kind of recursion. (Please note I don't mean a "recursive function" but recursion in a more mathematical sense.)

The seed is BitsSetTable256[0] = 0;

Then every element is initialized using the (already computed) result for i / 2, adding either 1 or 0 on top. Thereby,

  • 1 is added if the last bit of index i is 1
  • 0 is added if the last bit of index i is 0.

To get the value of the last bit of i, i & 1 is the usual C/C++ bit-mask trick.

Why is the result of BitsSetTable256[i / 2] a value to build upon? Because BitsSetTable256[i / 2] is the number of set bits of i, the last one excluded.

Please note that i / 2 and i >> 1 (the value (or bits) shifted to the right by 1, whereby the least significant/last bit is dropped) are equivalent expressions (for non-negative numbers in the respective range; edge cases excluded).
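
To see the recurrence in action, here is a minimal, self-contained sketch (the array name is taken from the question's code) that builds the table and prints the first few entries, each one derived from the entry at half its index:

    #include <iostream>

    int BitsSetTable256[256];

    int main()
    {
        // Seed: zero has no set bits.
        BitsSetTable256[0] = 0;

        // Every further entry reuses the already computed entry for i / 2
        // and adds the last bit of i.
        for (int i = 1; i < 256; ++i)
            BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];

        // First entries: 0 1 1 2 1 2 2 3 ...
        for (int i = 0; i < 8; ++i)
            std::cout << i << " -> " << BitsSetTable256[i] << '\n';
    }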


Concerning the other part of the question:

    return (BitsSetTable256[n & 0xff] +  
            BitsSetTable256[(n >> 8) & 0xff] +  
            BitsSetTable256[(n >> 16) & 0xff] +  
            BitsSetTable256[n >> 24]);
  • n & 0xff masks out the upper bits, isolating the lower 8 bits.
  • (n >> 8) & 0xff shifts the value of n 8 bits to the right (whereby the 8 least significant bits are dropped) and then again masks out the upper bits, isolating the lower 8 bits.
  • (n >> 16) & 0xff shifts the value of n 16 bits to the right (whereby the 16 least significant bits are dropped) and then again masks out the upper bits, isolating the lower 8 bits.
  • n >> 24 shifts the value of n 24 bits to the right (whereby the 24 least significant bits are dropped), which should effectively make the upper 8 bits the lower 8 bits.

Assuming that int and unsigned usually have 32 bits on today's common platforms, this covers all bits of n.

Please note that the right shift of a negative value is implementation-defined.

(I recalled Bitwise shift operators to be sure.)

So, a right shift of a negative value may fill all upper bits with 1s.
In that case the value of n >> 24, being a signed int, is negative and falls
outside the range 0..255, and hence BitsSetTable256[n >> 24] is an out-of-bounds access.

The better solution would've been:

    return (BitsSetTable256[n & 0xff] +  
            BitsSetTable256[(n >> 8) & 0xff] +  
            BitsSetTable256[(n >> 16) & 0xff] +  
            BitsSetTable256[(n >> 24) & 0xff]);
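
An alternative (not part of the original snippet, just a sketch reusing the BitsSetTable256 array from above and assuming a 32-bit unsigned int) is to convert n to unsigned before shifting; then the shifts never sign-extend and every index stays within 0..255:

    int countSetBits(int n)
    {
        // Reinterpret the bit pattern as unsigned so that >> never
        // sign-extends and each index is guaranteed to be in 0..255.
        unsigned int u = static_cast<unsigned int>(n);
        return BitsSetTable256[u & 0xff] +
               BitsSetTable256[(u >> 8) & 0xff] +
               BitsSetTable256[(u >> 16) & 0xff] +
               BitsSetTable256[u >> 24];
    }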

Upvotes: 10

Codor

Reputation: 17605

The code in countSetBits takes an int as an argument; apparently 32 bits are assumed. The implementation extracts four single bytes from n by shifting and masking; for these four separate bytes, the lookup table is used and the numbers of set bits per byte are added to yield the result.

The initialization of the lookup table is a bit more tricky and can be seen as a form of dynamic programming. The entries are filled in order of increasing index. The first expression isolates the least significant bit and counts it; the second expression halves the argument (which could also be done by shifting). The halved argument is smaller, so it is correctly assumed that the value for it is already available in the lookup table.

For the access to the lookup table, consider the following example:

input value (contains 5 ones):
01010000 00000010 00000100 00010000

input value, shifting is not necessary
masked with 0xff (11111111)
00000000 00000000 00000000 00010000     (contains 1 one)

input value shifted by 8
00000000 01010000 00000010 00000100
and masked with 0xff (11111111)
00000000 00000000 00000000 00000100     (contains 1 one)

input value shifted by 16
00000000 00000000 01010000 00000010
and masked with 0xff (11111111)
00000000 00000000 00000000 00000010     (contains 1 one)

input value shifted by 24,
masking is not necessary
00000000 00000000 00000000 01010000     (contains 2 ones)

The extracted values have at most the lowermost 8 bits set, which means that the corresponding entries are available in the lookup table. The entries from the lookup table are added. The underlying idea is that the number of ones in the argument can be calculated byte-wise (in fact, any partition into bit strings would be suitable).
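
As a quick cross-check of the example above, here is a sketch (std::bitset is used purely as a reference counter; it is not part of the original code) that counts the ones per extracted byte and compares the sum with counting the whole word at once:

    #include <bitset>
    #include <iostream>

    int main()
    {
        // The example value from above:
        // 01010000 00000010 00000100 00010000
        unsigned int n = 0x50020410u;

        // Count the ones of each extracted byte separately ...
        auto perByte = std::bitset<8>(n & 0xff).count()
                     + std::bitset<8>((n >> 8) & 0xff).count()
                     + std::bitset<8>((n >> 16) & 0xff).count()
                     + std::bitset<8>(n >> 24).count();

        // ... and compare with counting the whole 32-bit word at once.
        std::cout << perByte << '\n';                      // 5
        std::cout << std::bitset<32>(n).count() << '\n';   // 5
    }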

Upvotes: 2

Lundin

Reputation: 213678

BitsSetTable256[0] = 0;
...
BitsSetTable256[i] = (i & 1) +  
     BitsSetTable256[i / 2];  

The above code fills the look-up table so that each entry contains the number of ones in the number used as index. It works as follows:

  • (i & 1) gives 1 for odd numbers, otherwise 0.
  • An even number will have as many binary 1s as that number divided by 2.
  • An odd number will have one more binary 1 than that number divided by 2.

Examples:

  • if i==8 (1000b) then (i & 1) + BitsSetTable256[i / 2] ->
    0 + BitsSetTable256[8 / 2] = 0 + the entry at index 4 (0100b) = 0 + 1.
  • if i==7 (0111b) then 1 + BitsSetTable256[7 / 2] = 1 + BitsSetTable256[3] = 1 + the entry at index 3 (0011b) = 1 + 2.

If you want a formal mathematical proof of why this is so, then I'm not the right person to ask; I'd poke one of the math sites for that.
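
If an empirical check is enough, here is a small sketch (array name from the question's code; std::bitset only serves as an independent reference count) that verifies the recurrence for every byte value:

    #include <bitset>
    #include <cassert>

    int BitsSetTable256[256];

    int main()
    {
        BitsSetTable256[0] = 0;
        for (int i = 1; i < 256; ++i)
        {
            BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];

            // Compare against an independent bit count of i.
            assert(BitsSetTable256[i] ==
                   static_cast<int>(std::bitset<8>(i).count()));
        }
    }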


As for the shift part, it's just the normal way of splitting up a 32-bit value into 4 x 8 bits, portably and without caring about endianness (any other method of doing that is highly questionable). If we un-sloppify the code, we get this:

BitsSetTable256[(n >>  0) & 0xFFu] +  
BitsSetTable256[(n >>  8) & 0xFFu] +  
BitsSetTable256[(n >> 16) & 0xFFu] +  
BitsSetTable256[(n >> 24) & 0xFFu] ;  

Each byte is shifted into the LS byte position, then masked out with a & 0xFFu byte mask.

Using bit shifts on int is, however, a code smell and potentially buggy. To avoid poorly defined behavior, you need to change the function to this:

#include <stdint.h>

uint32_t countSetBits (uint32_t n);
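
A self-contained sketch of how such a version could look (one possible way to flesh this out, not Lundin's exact code; the table initialisation is repeated from the question's code):

    #include <stdint.h>
    #include <iostream>

    static int BitsSetTable256[256];

    static void initialize()
    {
        BitsSetTable256[0] = 0;
        for (int i = 1; i < 256; ++i)
            BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
    }

    uint32_t countSetBits (uint32_t n)
    {
        // All shifts happen on an unsigned 32-bit value, so the behaviour
        // is fully defined and every index stays within 0..255.
        return BitsSetTable256[(n >>  0) & 0xFFu] +
               BitsSetTable256[(n >>  8) & 0xFFu] +
               BitsSetTable256[(n >> 16) & 0xFFu] +
               BitsSetTable256[(n >> 24) & 0xFFu];
    }

    int main()
    {
        initialize();
        std::cout << countSetBits(9u) << '\n';          // 2
        std::cout << countSetBits(0xFFFFFFFFu) << '\n'; // 32
    }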

Upvotes: 3
