GWW

Reputation: 44093

Could someone help explain what this C one liner does?

I can usually figure out most C code but this one is over my head.

#define kroundup32(x) (--(x), (x)|=(x)>>1, (x)|=(x)>>2, (x)|=(x)>>4, (x)|=(x)>>8, (x)|=(x)>>16, ++(x))

An example usage would be something like:

int x = 57;
kroundup32(x);
//x is now 64

A few other examples are:

1 to 1
2 to 2
7 to 8
31 to 32
60 to 64
3000 to 4096

I know it's rounding an integer up to the next power of 2 (leaving exact powers of 2 unchanged), but that's about as far as my knowledge goes.

Any explanations would be greatly appreciated.

Thanks

Upvotes: 13

Views: 886

Answers (3)

thomasrutter

Reputation: 117363

(--(x), (x)|=(x)>>1, (x)|=(x)>>2, (x)|=(x)>>4, (x)|=(x)>>8, (x)|=(x)>>16, ++(x))
  1. Decrease x by 1
  2. OR x with (x / 2).
  3. OR x with (x / 4).
  4. OR x with (x / 16).
  5. OR x with (x / 256).
  6. OR x with (x / 65536).
  7. Increase x by 1.

For a 32-bit unsigned integer, this rounds a value up to the closest power of 2 that is greater than or equal to it. The OR steps set every bit below the highest set bit, so x ends up as a power of 2 minus one, and the final increment turns that into a power of 2. It looks somewhat optimized and therefore not very readable: it uses only bitwise operations and shifts, and it is written as a macro, so there is no function call overhead.
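
To make the steps easier to follow, here is a rough sketch of the same sequence written as a function on a 32-bit unsigned value. The name roundup_pow2_u32 is made up for this example and is not part of the original code. For 57 the trace is: 57 - 1 = 56 (binary 111000), OR with 56 >> 1 gives 111100, OR with that >> 2 gives 111111 (63), the remaining shifts change nothing, and 63 + 1 = 64.

```c
#include <stdint.h>

/* Function form of the macro, assuming a 32-bit unsigned input.
   The name roundup_pow2_u32 is hypothetical, chosen for this sketch. */
static uint32_t roundup_pow2_u32(uint32_t x)
{
    x--;            /* exact powers of two must stay unchanged       */
    x |= x >> 1;    /* highest set bit is copied one position down   */
    x |= x >> 2;    /* the top two set bits are copied two more down */
    x |= x >> 4;    /* now up to 8 consecutive bits are set          */
    x |= x >> 8;    /* up to 16                                      */
    x |= x >> 16;   /* every bit below the highest set bit is set    */
    x++;            /* 2^n - 1 becomes 2^n                           */
    return x;
}
```

Note that with this sketch 0 maps to 0, and inputs above 2^31 wrap around to 0 because the result does not fit in 32 bits.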

Upvotes: 20

ony

Reputation: 13223

On my machine kroundup32 gives 6.000m rounds/sec,
and the following function gives 7.693m rounds/sec:

inline int scan_msb(int x)
{
#if defined(__i386__) || defined(__x86_64__)
    int y;
    __asm__("bsr %1, %0"
            : "=r" (y)
            : "r" (x)
            : "flags"); /* ZF */
    return y;
#else
#error "Implement me for your platform"
#endif
}

inline int roundup32(int x)
{
    if (x == 0) return x;
    else {
        const int bit = scan_msb(x);
        const int mask = ~((~0) << bit);
        if (x & mask) return (1 << (bit+1));
        else return (1 << bit);
    }
}

So @thomasrutter, I wouldn't say that it is "highly optimized".

Here is the corresponding assembly (only the meaningful part, from GCC 4.4.4):

kroundup32:
    subl    $1, %edi
    movl    %edi, %eax
    sarl    %eax
    orl %edi, %eax
    movl    %eax, %edx
    sarl    $2, %edx
    orl %eax, %edx
    movl    %edx, %eax
    sarl    $4, %eax
    orl %edx, %eax
    movl    %eax, %edx
    sarl    $8, %edx
    orl %eax, %edx
    movl    %edx, %eax
    sarl    $16, %eax
    orl %edx, %eax
    addl    $1, %eax
    ret

roundup32:
    testl   %edi, %edi
    movl    %edi, %eax
    je  .L6
    movl    $-1, %edx
    bsr %edi, %ecx
    sall    %cl, %edx
    notl    %edx
    testl   %edi, %edx
    jne .L10
    movl    $1, %eax
    sall    %cl, %eax
.L6:
    rep
    ret
.L10:
    addl    $1, %ecx
    movl    $1, %eax
    sall    %cl, %eax
    ret

For some reason I haven't found a suitable implementation of scan_msb (something like #define scan_msb(x) if (__builtin_constant_p (x)) ...) in the standard GCC headers (only __TBB_machine_lg/__TBB_Log2).
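
One possible stand-in for the inline asm is the compiler builtin __builtin_clz (count leading zeros), which GCC and Clang both provide. The following is only a sketch under some assumptions: unsigned int is 32 bits wide, the input is at most 2^31, and the name roundup32_clz is made up here. It has not been benchmarked against the two versions above.

```c
/* Sketch using __builtin_clz instead of a hand-written bsr wrapper.
   Assumes GCC or Clang and a 32-bit unsigned int.
   roundup32_clz is a hypothetical name for this example. */
static unsigned roundup32_clz(unsigned x)
{
    /* 0 and 1 are returned as-is; __builtin_clz(0) is undefined,
       so x - 1 must be non-zero before calling it. */
    if (x <= 1)
        return x;

    /* 32 - clz(x - 1) is the index of the highest set bit of x - 1,
       plus one. Shifting 1 by that amount gives the next power of 2.
       For x > 2^31 the shift count would be 32, which is undefined,
       so such inputs are outside the assumed range. */
    return 1u << (32 - __builtin_clz(x - 1));
}
```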

Upvotes: 6

Drew Hall

Reputation: 29055

The bitwise or and shift operations essentially set every bit between the highest set bit and bit zero. This will produce a number of the form 2^n - 1. The final increment adds one to get a number of the form 2^n. The initial decrement ensures that you don't round numbers which are already powers of two up to the next power, so that e.g. 2048 doesn't become 4096.
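
A small sketch that isolates the effect of the initial decrement, assuming 32-bit unsigned values. The helper names smear_right, with_decrement and without_decrement are invented for this illustration.

```c
#include <stdint.h>

/* Set every bit below the highest set bit, giving 2^n - 1
   (or 0 when x is 0). The helper name is hypothetical. */
static uint32_t smear_right(uint32_t x)
{
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x;
}

/* Same order of operations as the macro: decrement, smear, increment. */
static uint32_t with_decrement(uint32_t x)
{
    return smear_right(x - 1) + 1;
}

/* Variant that skips the decrement: exact powers of two get doubled. */
static uint32_t without_decrement(uint32_t x)
{
    return smear_right(x) + 1;
}
```

With these definitions, with_decrement(2048) stays at 2048 while without_decrement(2048) gives 4096; for a value that is not a power of two, such as 2049, both return 4096.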

Upvotes: 6
