General bit counting

Question

I'm aware of is this fairly nice solution for counting the number set bits in an integer:

CountBits(n)    
    count = 0
    while n > 0
        n ← n & (n - 1)
        count ← count + 1

Is there a similarly elegant solution of counting how many non-overlapping k bit groups are non-zero. I currently have:

CountGroups(n, k)
    count = 0
    mask = (2 << k) - 1
    while n > 0
        if n & mask ≠ 0
            count ← count + 1
        n >> k

which is linear in the position of the left most group. In the latter algorithm, if only the first and last groups are set, I have to visit and check all the groups in between, while the former only performs two operations regardless. It's not a bottleneck or anything, just curious if there is a better way.

Aki Suihkonen · Accepted Answer

One can calculate groups of k set bits in parallel by splitting the word to odd and even sequences. Adding one to each group, carries to the next empty position (marked with ...) if and only if all bits in the group were set.

The original question is about any bit is set in a group, but this can be transformed by complementing n, adding one, and complementing the result. (Now result == 1 if and only if the original group was all zeroes.)

 // ...xxx...xxx...xxx   even sequence
 // yyy...yyy...yyy...   odd sequence

 uint64_t a = ~n & even_mask;         // inverse of n
 uint64_t b = (~n >> k) & even_mask;  // inverse of n
 a += 0010101;  // this is octal, where a "one" is added to each group
 b += 0010101;  // same for odd sequence

 //  a = 001xxx000yyy001xxx  -- e.g. first and last group were all ones
 //  b = 000yyy000yyy001yyy  -- e.g. last group of 'y' was all one

 a &= ~even_mask;   // remove the xxx yyy parts, leaving only carry
 b &= ~even_mask;
 a |= (b << 1);
 return bitcount(a ^ 03030303);   // return total number of carry bits

Bitcount can be done in parallel, with special instruction if available (__popcount) or with the Kerningham Richie -method (n & (n-1)) as described.

This only illustrates k==3, but can be extended for arbitrary k, even though the return of interest diminishes when k grows larger.

Depending on k there might be better algorithm to just derive a boolean expression for the condition.

In this case for each group the condition is n(i) | n(i+1) | ... n(i+k-1), which can be evaluated in parallel and/or using a folding technique, which is especially powerful when k is a power of 2

//  n = aabb ccdd eeff gghh iijj   -- original sequence, k = 4
//      0011 0011 0011 0011 0011   -- mask
n = (n | (n >> 2)) & 0x33333333;   // fold to aa|bb, cc|dd, etc.
n = (n | (n >> 1)) & 0x11111111;   // fold boolean expression as 0th bit position in each group

return bitcount(n);

General bit counting

Answers (1)

Related Questions