Reputation: 44093
I can usually figure out most C code but this one is over my head.
#define kroundup32(x) (--(x), (x)|=(x)>>1, (x)|=(x)>>2, (x)|=(x)>>4, (x)|=(x)>>8, (x)|=(x)>>16, ++(x))
an example usage would be something like:
int x = 57;
kroundup32(x);
//x is now 64
A few other examples are:
1 to 1
2 to 2
7 to 8
31 to 32
60 to 64
3000 to 4096
I know it's rounding an integer to it's nearest power of 2, but that's about as far as my knowledge goes.
Any explanations would be greatly appreciated.
Thanks
Upvotes: 13
Views: 886
Reputation: 117363
(--(x), (x)|=(x)>>1, (x)|=(x)>>2, (x)|=(x)>>4, (x)|=(x)>>8, (x)|=(x)>>16, ++(x))
For a 32-bit unsigned integer, this should move a value up to the closest power of 2 that is equal or greater. The OR sections set all the lower bits below the highest bit, so it ends up as a power of 2 minus one, then you add one back to it. It looks like it's somewhat optimized and therefore not very readable; doing it by bitwise operations and bit shifting alone, and as a macro (so no function call overhead).
Upvotes: 20
Reputation: 13223
At my machine kroundup32
gives 6.000m rounds/sec
And next function gives 7.693m rounds/sec
inline int scan_msb(int x)
{
#if defined(__i386__) || defined(__x86_64__)
int y;
__asm__("bsr %1, %0"
: "=r" (y)
: "r" (x)
: "flags"); /* ZF */
return y;
#else
#error "Implement me for your platform"
#endif
}
inline int roundup32(int x)
{
if (x == 0) return x;
else {
const int bit = scan_msb(x);
const int mask = ~((~0) << bit);
if (x & mask) return (1 << (bit+1));
else return (1 << bit);
}
}
So @thomasrutter I woudn't say that it is "highly optimized".
And appropriate (only meaningful part) assembly (for GCC 4.4.4):
kroundup32:
subl $1, %edi
movl %edi, %eax
sarl %eax
orl %edi, %eax
movl %eax, %edx
sarl $2, %edx
orl %eax, %edx
movl %edx, %eax
sarl $4, %eax
orl %edx, %eax
movl %eax, %edx
sarl $8, %edx
orl %eax, %edx
movl %edx, %eax
sarl $16, %eax
orl %edx, %eax
addl $1, %eax
ret
roundup32:
testl %edi, %edi
movl %edi, %eax
je .L6
movl $-1, %edx
bsr %edi, %ecx
sall %cl, %edx
notl %edx
testl %edi, %edx
jne .L10
movl $1, %eax
sall %cl, %eax
.L6:
rep
ret
.L10:
addl $1, %ecx
movl $1, %eax
sall %cl, %eax
ret
By some reason I haven't found appropriate implementation of scan_msb
(like #define scan_msb(x) if (__builtin_constant_p (x)) ...
) within standart headers of GCC (only __TBB_machine_lg
/__TBB_Log2
).
Upvotes: 6
Reputation: 29055
The bitwise or and shift operations essentially set every bit between the highest set bit and bit zero. This will produce a number of the form 2^n - 1
. The final increment adds one to get a number of the form 2^n
. The initial decrement ensures that you don't round numbers which are already powers of two up to the next power, so that e.g. 2048 doesn't become 4096.
Upvotes: 6