Reputation: 5735
I know how to find out how many bits are on in a given number (or how many elements are true in a boolean array), using a mask and bitwise operators, going over all bits and checking whether each is on. Assuming the number is of arbitrary length, the algorithm runs in O(n) time, where n is the number of bits in the number. Is there an asymptotically better algorithm? I don't think that's possible, but how can I formally prove it?
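For reference, a minimal C sketch of the per-bit approach the question describes, testing each position with a mask (the function name and the fixed 32-bit width are my own choices for illustration):

#include <stdint.h>

// Naive popcount: test every bit position with a mask.
// Runs in O(n) time for an n-bit word (here n = 32).
int count_bits_naive(uint32_t v)
{
    int c = 0;
    for (int i = 0; i < 32; i++)
    {
        if (v & (UINT32_C(1) << i))
            c++;
    }
    return c;
}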
Upvotes: 16
Views: 5865
Reputation: 206
I think the type of formality you're looking for is an "adversarial proof".
Suppose one has an algorithm A that runs in o(n) time. Then for sufficiently large n, A must not look at some bit i. We claim that A must be incorrect. An "adversary" gives A two strings s1 and s2 that are identical except for opposite values of bit i. The algorithm A returns the same value for s1 and s2, but their true bit counts differ by one, so the adversary has "tricked" A into giving the wrong answer on at least one of them. Hence no correct algorithm A running in o(n) time exists.
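A slightly more formal version of the same argument, in my own notation (assuming A is deterministic):

    % If A runs in o(n) time then, for all sufficiently large n,
    % A halts after reading fewer than n bits of its input.
    % Fix an input s_1 of length n; some bit i of s_1 is never read,
    % so flipping it cannot change A's execution:
    A(s_1) = A(s_2), \qquad s_2 = s_1 \oplus e_i ,
    % yet the correct answers differ:
    \mathrm{popcount}(s_2) = \mathrm{popcount}(s_1) \pm 1 ,
    % hence A is wrong on at least one of s_1 and s_2.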
Upvotes: 1
Reputation: 112356
Okay, there seems to be some confusion here about asymptotic notation, "big O".
It is correct that Brian Kernighan's algorithm is better in terms of number of operations. It is, however, not correct that it is asymptotically better.
This can be seen from the definition of big-O.
Recall that by definition a function f(n) is O(g(n)) when there exists a constant k such that f(n) ≤ k·g(n) once n grows sufficiently large.
Now, let's define w to be the number of bits set in the word, and note, as observed above, that the run time of Kernighan's algorithm on a single word is a function of the number of bits set; call that function c(w). We know that there's a fixed word width, call it ww; clearly, for any word, 0 ≤ w ≤ ww, so the worst case for a single word is c(ww), a constant. So, the run time of this algorithm over n words is, at worst, n·c(ww).
Thus the run time is at most n·c(ww) = k·n with k = c(ww), and so by the definition above, this algorithm has an asymptotic worst-case run time of O(n).
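A minimal C sketch of what the analysis describes, with Kernighan's loop applied to each of n fixed-width words (function names are mine): the per-word cost is bounded by the constant c(ww), so the total is at most n·c(ww).

#include <stddef.h>
#include <stdint.h>

// Kernighan's loop on one 32-bit word: at most 32 iterations,
// so the per-word cost c(w) is bounded by the constant c(ww).
static int popcount_word(uint32_t v)
{
    int c = 0;
    for (; v; c++)
        v &= v - 1; // clear the least significant set bit
    return c;
}

// Over n words, the total cost is at most n * c(ww), hence O(n).
int popcount_array(const uint32_t *words, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += popcount_word(words[i]);
    return total;
}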
Upvotes: 0
Reputation: 11522
The fastest way to do this calculation is with a lookup table indexed by byte value: here edx holds the table's base address (one dword entry per byte value) and the bl register contains the byte. If the number is a single byte, the answer is a single table load (after zero-extending the index):
movzx ebx, bl           ; zero-extend the byte index
mov eax, [edx+ebx*4]    ; fetch its bit count from the table
If the number has many bytes in it (say an array pointed to by ebp), then you loop through the bytes (where ecx is the number of bytes in the array containing the number):
    sub ecx, 1              ; ecx = index of the last byte
    xor eax, eax            ; eax accumulates the total bit count
    xor ebx, ebx            ; clear ebx so bl can index the table
DoNextByte:
    mov bl, [ebp+ecx]       ; load the next byte of the number
    add eax, [edx+ebx*4]    ; add that byte's bit count from the table
    test ecx, ecx
    jz Done
    sub ecx, 1
    jmp DoNextByte
Done:
    ; answer is in eax
This is the absolute fastest way to do this and will be faster than any mathematical computation. Note that the shift instructions in Art's solution are very CPU expensive.
The problem with Kernighan's solution is that even when hand-coded in assembly it is slower than my algorithm. If it is compiled from C, it will probably generate a lot of memory accesses that will slow it down even beyond the larger number of clock cycles it requires.
Note that if the byte-to-count table is placed right next to this code, the whole table will sit in the CPU cache, so it will be really fast. In this case, no C program will even come close (think 20x slower or more).
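For completeness, a hedged C sketch of one way to precompute the 256-entry byte-to-count table this answer assumes is sitting at edx (dword entries, to match the indexing above; the names are mine):

#include <stdint.h>

// table[b] = number of set bits in the byte value b.
// Dword entries so the assembly above can add them directly.
static uint32_t bit_count_table[256];

static void init_bit_count_table(void)
{
    for (int b = 0; b < 256; b++)
    {
        int c = 0;
        for (int v = b; v; v >>= 1)
            c += v & 1;
        bit_count_table[b] = (uint32_t)c;
    }
}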
Upvotes: 3
Reputation: 4622
Well, you can also use a lookup table holding the number of set bits for each byte value, then split the number into bytes and add up the lookup values.
It will still be O(number of bits), but with a smaller constant factor.
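A minimal C sketch of this approach, assuming an 8-bit lookup table filled in beforehand (the function and table names are illustrative):

#include <stdint.h>

// Assumes byte_counts[b] holds the number of set bits in byte b.
extern const uint8_t byte_counts[256];

// Split a 32-bit word into four bytes and sum their table entries.
int count_bits_bytewise(uint32_t v)
{
    return byte_counts[v & 0xff]
         + byte_counts[(v >> 8) & 0xff]
         + byte_counts[(v >> 16) & 0xff]
         + byte_counts[(v >> 24) & 0xff];
}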
Upvotes: 1
Reputation: 361565
Bit Twiddling Hacks presents a number of methods, including this one:
Counting bits set, Brian Kernighan's way
unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v
for (c = 0; v; c++)
{
    v &= v - 1; // clear the least significant bit set
}
Brian Kernighan's method goes through as many iterations as there are set bits. So if we have a 32-bit word with only the high bit set, it will go through the loop only once.
Examples of the algorithm in action (128 finishes in one iteration, 177 in four, 255 in eight):
128 & 127 == 0     10000000 & 01111111 == 00000000

177 & 176 == 176   10110001 & 10110000 == 10110000
176 & 175 == 160   10110000 & 10101111 == 10100000
160 & 159 == 128   10100000 & 10011111 == 10000000
128 & 127 == 0     10000000 & 01111111 == 00000000

255 & 254 == 254   11111111 & 11111110 == 11111110
254 & 253 == 252   11111110 & 11111101 == 11111100
252 & 251 == 248   11111100 & 11111011 == 11111000
248 & 247 == 240   11111000 & 11110111 == 11110000
240 & 239 == 224   11110000 & 11101111 == 11100000
224 & 223 == 192   11100000 & 11011111 == 11000000
192 & 191 == 128   11000000 & 10111111 == 10000000
128 & 127 == 0     10000000 & 01111111 == 00000000
As for the language-agnostic question of algorithmic complexity, it is not possible to do better than O(n) where n is the number of bits: any correct algorithm must, in the worst case, examine every bit of the number.
What's tricky here is when you aren't careful about the definition of n and let n be "the number of bit shifting/masking instructions" or some such. If n is the number of bits, then even a single bit mask (&) is already an O(n) operation.
So, can this be done in better than O(n) bit tests? No.
Can it be done in fewer than O(n) add/shift/mask operations? Yes.
Upvotes: 24
Reputation: 20392
I always use this:
#include <stdint.h>

int
count_bits(uint32_t v)
{
    v = v - ((v >> 1) & 0x55555555);                        // counts in each 2-bit chunk
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333);         // counts in each 4-bit chunk
    return ((v + (v >> 4) & 0xf0f0f0f) * 0x1010101) >> 24;  // sum all bytes into the top byte
}
You have to know the size of your integers.
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
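A quick usage sketch (the test values are my own):

#include <assert.h>

int main(void)
{
    assert(count_bits(0) == 0);            // no bits set
    assert(count_bits(0x80000000u) == 1);  // only the high bit
    assert(count_bits(0xffffffffu) == 32); // all 32 bits set
    return 0;
}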
Upvotes: 7
Reputation: 11015
Brian Kernighan's algorithm to count 1-bits:
unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v
for (c = 0; v; c++)
{
    v &= v - 1; // clear the least significant bit set
}
Read this and other bit-twiddling hacks here: Bit-twiddling hacks.
Upvotes: 4