eold
eold

Reputation: 6042

Efficient Division by Least Significant Bit set in C/C++

I would like to perform the following operation as quickly as possible

x / LSB(x)  

where x is an integral value unknown at compile time and LSB(x) = x & -x. (Alternatively, the operation is equivalent to an even division by the highest power of 2 <= x.) I am looking for a reasonably portable solution (without compiler intrinsics/builtins like GCC's __builtin_clz or alike).

My concern is that the following simple implementation

x / (x & -x)

would still result in an expensive division as compiler might fail to realize that the division is in fact equivalent to right-shift by the number of trailing zeroes in the divisor.

If my concerns are reasonable, what would be a more efficient way to implement it?

I would appreciate a solution that is easily extendible to integral types of sizes 32-bit, 64-bits, 128-bits, ...

Upvotes: 2

Views: 500

Answers (2)

Apriori
Apriori

Reputation: 2306

If you don't want to rely on CLZ (count leading zeros) hardware instructions, you can count leading zeros as described in this answer. It's very fast with a look-up and multiplication by a magic number. I'll re-post the code here:

unsigned x;  // input to clz
unsigned c;  // output of clz
static const unsigned MultiplyDeBruijnBitPosition[32] = 
{
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};
c = MultiplyDeBruijnBitPosition[((unsigned)((x & -x) * 0x077CB531U)) >> 27];

Once you have counted the leading zeros, you no loner need to use a division instruction. Instead, you can just shift the value right by c. That is (eliminating an unneeded temporary value), the code becomes this:

static const unsigned MultiplyDeBruijnBitPosition[32] = 
{
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

x >>= MultiplyDeBruijnBitPosition[((unsigned)((x & -x) * 0x077CB531U)) >> 27]; // x /= LSB(x)  

Upvotes: 0

pat
pat

Reputation: 12749

How about

x >>= ffs(x)-1;

The ffs function conforms to 4.3BSD, POSIX.1-2001.

It won't work if x is 0.

Upvotes: 1

Related Questions