Vincent
Vincent

Reputation: 60381

Compile-time recursive function to compute the next power of two of an integer?

On the Bit Twiddling Hacks website the following algorithm is provided to round up an integer to the next power of two:

unsigned int v; // compute the next highest power of 2 of 32-bit v
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
v++;

I would like to code a metaprogramming function that will compute the same operation:

and here is the form of the expected function:

template <typename Type,
          // Something here (like a recursion index)
          class = typename std::enable_if<std::is_integral<Type>::value>::type,
          class = typename std::enable_if<std::is_unsigned<Type>::value>::type>
constexpr Type function(const Type value)
{
     // Something here
}

How to do that ?

Example: for value = 42 it should return 64

Upvotes: 8

Views: 2488

Answers (4)

Cloud
Cloud

Reputation: 1411

If you are familiar with ceil_div, that is ceil(float(x)/n) * n, its implementation in integer domain is as simple as (x-1)/n * n

next_power_of_two is basically 2^ceil(log2(float(x)))

constexpr uint32_t ceil_log2(uint32_t n) {
  return n <= 1 ? 0 : 1 + ceil_log2((n + 1) / 2);
}

constexpr uint32_t next_power_of_two(uint32_t n) {
    return uint32_t(1) << ceil_log2(n);
}

Note the fast implementation is actually an unrolled version of previous logic.

unsigned int v;
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
v++;

See godbolt for more details.

Upvotes: 0

Graham Palmer
Graham Palmer

Reputation: 53

Although less efficient in the general, this algorithm will do the job rather concisely:

template <typename T,
          typename = typename std::enable_if<std::is_integral<T>::value>::type,
          typename = typename std::enable_if<std::is_unsigned<T>::value>::type>
constexpr T upperPowerOfTwo(T value, size_t pow = 0)
{
    return (value >> pow) ? upperPowerOfTwo(value, pow + 1)
                          : T(1) << pow;
}

This also allows you to specify a minimum power of 2 - i.e. upperPowerOfTwo(1, 3) return 8.

The reason this is less efficient for most cases is that it makes O(sizeof(Type)*CHAR_BIT) calls whereas the algorithm you linked performs O(log(sizeof(Type)*CHAR_BIT)) operations. The caveat is that this algorithm will terminate after log(v) calls, so if v is small enough (i.e. < log(max value of v-type)) it will be faster.

Upvotes: 1

Howard Hinnant
Howard Hinnant

Reputation: 218750

This may not be what you can do unfortunately. But if by any chance you have a constexpr count leading zeros compiler intrinsic, the following is very efficient both at compile time, and at run time if you happen to give it run time arguments:

#include <climits>

template <class Int>
inline
constexpr
Int
clp2(Int v)
{
    return v > 1 ? 1 << (sizeof(Int)*CHAR_BIT - __builtin_clz(v-1)) : v;
}

int
main()
{
    static_assert(clp2(0) == 0, "");
    static_assert(clp2(1) == 1, "");
    static_assert(clp2(2) == 2, "");
    static_assert(clp2(3) == 4, "");
    static_assert(clp2(4) == 4, "");
    static_assert(clp2(5) == 8, "");
    static_assert(clp2(6) == 8, "");
    static_assert(clp2(7) == 8, "");
    static_assert(clp2(8) == 8, "");
    static_assert(clp2(42) == 64, "");
}

I compiled the above with tip-of-trunk clang. It is not without its issues. You need to decide what you want to do with negative arguments. But many architectures and compilers have an intrinsic like this (shame it isn't standard C/C++ by now). And some of those may make the intrinsic constexpr.

Without such an intrinsic, I would fall back to something along the lines of Adam H. Peterson's algorithm. But the nice thing about this one is its simplicity and efficiency.

Upvotes: 3

Adam H. Peterson
Adam H. Peterson

Reputation: 4591

This ought to implement the algorithm you give:

template<typename T>
constexpr T roundup_helper( T value, unsigned maxb, unsigned curb ) {
    return maxb<=curb
            ? value
            : roundup_helper( ((value-1) | ((value-1)>>curb))+1, maxb, curb << 1 )
            ;
}

template<typename T,
        typename = typename enable_if<is_integral<T>::value>::type,
        typename = typename enable_if<is_unsigned<T>::value>::type>
constexpr T roundup( T value ) {
    return roundup_helper( value, sizeof(T)*CHAR_BIT, 1 );
}

At least, it seems to work fine in my test program.

Alternatively, you can move the v-1 and v+1 out of the helper function like so:

template<typename T>
constexpr T roundup_helper( T value, unsigned maxb, unsigned curb ) {
    return maxb<=curb
            ? value
            : roundup_helper( value | (value>>curb), maxb, curb << 1 )
            ;
}

template<typename T,
        typename = typename enable_if<is_integral<T>::value>::type,
        typename = typename enable_if<is_unsigned<T>::value>::type>
constexpr T roundup( T value ) {
    return roundup_helper( value-1, sizeof(T)*CHAR_BIT, 1 )+1;
}

Another possibility is to take advantage of default arguments and put it all in a single function:

template<typename T,
        typename = typename enable_if<is_integral<T>::value>::type,
        typename = typename enable_if<is_unsigned<T>::value>::type>
constexpr T roundup(
        T value,
        unsigned maxb = sizeof(T)*CHAR_BIT,
        unsigned curb = 1
        ) {
    return maxb<=curb
            ? value
            : roundup( ((value-1) | ((value-1)>>curb))+1, maxb, curb << 1 )
            ;
}

Upvotes: 10

Related Questions