Reputation: 37

Why does this function count the number of set bits in an integer

I was asked the following question in an interview:

int foofoo(unsigned int u) {
    unsigned int foo = u;
    do{
        u = u/2;
        foo -= u;
    }while(u > 0);
    return foo;
}

I was asked to tell what does this function do and I was able to find that it counts the number of set bits in an unsigned int value, but I was not able to prove that, maybe someone can?

Upvotes: 3

Answers (3)

Ruslan Osmanov

Reputation: 21502

The Idea

It is known that an unsigned integer u can be represented as a sum of powers of two:

u = A * 2ª + ... + N * 2ⁿ,

where 0 <= a < ... < n are integers, and A, ..., N are either zeroes, or ones. If we remove the zero-product terms, the number of terms will be equal to the number of set bits (ones). For example, 1101_b = 2⁰ + 2² + 2³ consists of three terms (powers of two), and the number of set bits is also equal to three.

The idea is to find the number of non-zero terms in this representation.

Algorithm

If we express u as a sum of powers of two:

u = 2ª + ... + 2ⁿ

where 0 <= a < ... < n, then the integer division u = u / 2 can be expressed as:

u = ( 2ª + ... + 2ⁿ ) / 2

or, equivalently:

u = ( 2ª / 2 ) + ... + ( 2ⁿ / 2 )

(Since, generally, (x + y) / 2 = (x / 2) + (y / 2).)

The process is repeated while u is positive. So each term eventually becomes equal to one. In the next iteration it becomes equal to zero due to the rules of integer division: 1 / 2 = 0.

The resulting value (u) is subtracted from the original value stored in foo: foo -= u.

Thus, we eventually subtract everything from the original value foo, except the "ones divided by two" (1 / 2).

Obviously, the number of occurrences of 1 / 2 is equal to the number of non-zero terms in the original expression of u which is equal to the number of set bits (as we found out above). Hence, the value remaining ìn foo is a sum of "1 / 2" leftovers, i.e. the number of terms which is also the number of set bits.

Example

Let's visualize the process on example of u = 13.

Binary representation of the decimal number 13 is 1101. So the sum of powers of two in decimal notation is:

u = 2⁰ + 2² + 2³

First iteration:

u = u / 2
  = (2⁰ + 2² + 2³) / 2
  = (2⁰ / 2) + (2² / 2) + (2³ / 2)
  = 0        + 2        + 2²

We have one zero so far: 2⁰ / 2 = 1 / 2 = 0.

Second iteration:

u = u / 2
  = (2      + 2²) / 2
  = (2 / 2) + (2² / 2)
  = 1       + 2

No more zeroes in this iteration. Third iteration:

u = u / 2
  = (1      + 2) / 2
  = (1 / 2) + (2 / 2)
  = 0       + 1

We have got the second zero in this iteration: 1 / 2 = 0.

Fourth iteration gives the third zero:

u = u / 2
  = 1 / 2 = 0.

The number of zeroes is equal to the number of set bits, i.e. 3.

Upvotes: 1

n. m. could be an AI

Reputation: 120031

Suppose when the input is N, the output is M. Then when the input is 2*N the output is still M, and when the input is 2*N+1 the output must be M+1 (both statements follow from the analysis of the first iteration of the loop which reduces the input to N).

Upvotes: 0

4386427

Reputation: 44329

I was able to find that it counts the number of set bits in an unsigned int value, but I was not able to prove that, maybe someone can?

Look at a single bit minside U. That bit contributes to the value of U like:

U = ...... + U_m 2^m + ..... (where the .... represents other bits).

In each loop the bit is divived by 2 and then subtracted from U. When the loop is complete, this looks like the line below for bit m:

U_m 2^m - U_m 2^(m-1) - U_m 2^(m-2) - .... - U_m 2^0

This can be rewritten like:

U_m (2^m - (2^(m-1) + 2^(m-2) + .... + 2^0))

U_m (2^m - (2^m -1))

U_m (2^m - 2^m + 1)

U_m

So from this we can see, that bit m contributes to the final result with its bit-value (i.e. 0 or 1). This, of cause, holds true for all bit in U. Therefore:

foo = U_n-1 + U_n-2 + ... + U_m + ... + U₁ + U₀

Consequently, the result will be equal to the number of bits set in U.

p.s. I tried to use superscript to make the formulas look better. But I couldn't make <sup> work. Feel free to edit the formulas if you know how.