Understanding a function that uses binary operands to count the number of 1s in a binary number

Question

Recently I've run across this question that asks you to return the number of consecutive ones in a binary string, going from left to right (so 11101001 = 3, 011111 = 0, 111101 = 4, etc). The catch is to do it using only binary operands, and without any sort of loops.

After looking around online for a bit (thanks Google), I found this solution someone else came up with, and would like better help understanding it.

int leftBitCount(int x) {
    int v = x;
    int r;  // store our result here
    int shift;
    int full = !(~x); // we must add one if we have 0xffffffff

    // Check the top 16 bits and add them to our result if they exist
    r     = !(~(v>>16)) << 4;
    v   <<= r;
    // check the remaining 8 bits
    shift = !(~(v >> 24)) << 3;
    v   <<= shift;
    r    |= shift;
    // remaining 4 bits
    shift = !(~(v>>28)) << 2;
    v   <<= shift;
    r    |= shift;
    // remaining 2 bits
    shift = !(~(v >> 30)) << 1;
    v   <<= shift;
    r    |= shift;
    // remaining 1 bits
    r    ^= 1&((v>>31));

    // remember to add one if we have 32 on bits
    return r + full;
}

From what I can tell, the function apparently checks the first 16 bits of the 32 bit integer, then goes to next 8 depending on whether they were all 1s, then the next 4 depending whether those were all 1s, then 2, then 1, etc.

Unfortunately, I'm kind of lost as to how exactly the code accomplishes this, and would like some help understanding.

For example, here:

r     = !(~(v>>16)) << 4;
v   <<= r;

I can see that v is being shifted and negated, but am at a loss as to how this helps solve anything.

Joe Farrell · Accepted Answer

r = !(~(v>>16)) << 4;

Here's what's happening on this line:

v >> 16 shifts the value of v (which at this time is the value of x) 16 places to the right. The bottom 16 bits of this expression are the top 16 bits of v, and the top 16 bits of this expression, interestingly, are implementation-defined. The C compilers I've worked with have performed a right shift of a signed value as an arithmetic shift, and the code you posted seems to be counting on this fact, but it's not necessarily the case. See the answers to this question for details.
The ~ operator performs a bitwise complement of the value produced in step 1. The important observation here is that if and only if (a) the top 16 bits of v were all 1s, and (b) the C compiler being used performs an arithmetic shift, then all the bits of the expression in step 1 will be 1s, which means that ~(v>>16) will be zero.
The ! operator performs a logical negation of the value produced in step 2, which means that !(~(v>>16)) will be 1 if and only if the value in step 2 is zero (meaning that the top 16 bits of v are all 1s). Otherwise, this value will be zero.
Finally, the << 4 is just a multiplication by 16. So r is assigned the value 16 if the top 16 bits of x were set to 1, or the value zero if any of the top 16 bits were 0s (or if the C compiler being used performs a logical right shift).

Next, this line:

v <<= r;

If the top 16 bits of v are all 1s—meaning that the value of r is 16—then we don't need to examine those bits any further, so this line shifts v to the left by 16 bits, which essentially discards those top 16 bits that we've already examined, and places the eight next-most-significant bits in the topmost positions, so we can examine those next.

If, on the other hand, the top 16 bits of v are not all 1s—meaning that the value of r is 0—then the bottom 16 bits of v become irrelevant, because whatever the number of leading 1 bits in v is, we know that it must be less than 16. So in that case, v <<= r leaves the value of v untouched, and we proceed to the next step by examining the eight original topmost bits in v.

The rest of the algorithm basically repeats these steps, except that the number of bits being examined falls by half in each step. So this bit of code checks the top 8 bits:

shift = !(~(v >> 24)) << 3;
v <<= shift;
r |= shift;

The only thing different here is that r is being augmented with a logical OR rather than being assigned directly as it was in the 16-bit step. This is functionally equivalent to addition in this case, because we know that the value of r was previously 0 or 16, and the value of shift is likewise either 0 or 8, and none of those values' 1 bits occur in corresponding positions. Then this bit of code does the same with the top 4 bits:

shift = !(~(v>>28)) << 2;
v <<= shift;
r |= shift;

Then the top 2:

shift = !(~(v >> 30)) << 1;
v <<= shift;
r |= shift;

And finally, the top 1:

r ^= 1&((v>>31));

The number of bits thus tested is 16 + 8 + 4 + 2 + 1 = 31, which is obviously one shy of the total number of bits in the original value, and thus not sufficient in the case where all 32 bits are 1s. Assuming that all 32 bits are 1s, here's how the original value of x will be shifted in each step:

16-bit step: 123456789ABCDEFGHIJKLMNOPQRSTUVW
 8-bit step: HIJKLMNOPQRSTUVW----------------
 4-bit step: PQRSTUVW------------------------
 2-bit step: TUVW----------------------------
 1-bit step: VW------------------------------

The one that's never considered by the code seen thus far is the W bit, the least significant bit of the original value. That bit will be part of the result if and only if all bits to the left of it are 1s, meaning that all bits in the value are 1s, and this is the purpose of the check performed at the top of the code:

 int full = !(~x);

This is the same combination of bitwise complement and logical negation we saw earlier; ~x will be zero if and only if all of the bits in x are 1s, and so !(~x) will be 1 in that case and zero in any other case.

Understanding a function that uses binary operands to count the number of 1s in a binary number

Answers (2)

Related Questions