Reputation: 17173

Compute fast log base 2 ceiling in python

for given x < 10^15, quickly and accurately determine the maximum integer p such that 2^p <= x

Here are some things I've tried:

First I tried this but it's not accurate for large numbers:

>>> from math import log
>>> x = 2**3
>>> x
8
>>> p = int(log(x, 2))
>>> 2**p == x
True
>>> x = 2**50
>>> p = int(log(x, 2))
>>> 2**p == x #not accurate for large numbers?
False

I could try something like:

p = 1
i = 1
while True:
    if i * 2 > n:
        break
    i *= 2
    p += 1
    not_p = n - p

Which would take up to 50 operations if p was 50

I could pre-compute all the powers of 2 up until 2^50, and use binary search to find p. This would take around log(50) operations but seems a bit excessive and ugly?

I found this thread for C based solutions: Compute fast log base 2 ceiling

However It seems a bit ugly and I wasn't exactly sure how to convert it to python.

Upvotes: 17

Answers (7)

DSM

Reputation: 353059

In Python >= 2.7, you can use the .bit_length() method of integers:

def brute(x):
    # determine max p such that 2^p <= x
    p = 0
    while 2**p <= x:
        p += 1
    return p-1

def easy(x):
    return x.bit_length() - 1

which gives

>>> brute(0), brute(2**3-1), brute(2**3)
(-1, 2, 3)
>>> easy(0), easy(2**3-1), easy(2**3)
(-1, 2, 3)
>>> brute(2**50-1), brute(2**50), brute(2**50+1)
(49, 50, 50)
>>> easy(2**50-1), easy(2**50), easy(2**50+1)
(49, 50, 50)
>>> 
>>> all(brute(n) == easy(n) for n in range(10**6))
True
>>> nums = (max(2**x+d, 0) for x in range(200) for d in range(-50, 50))
>>> all(brute(n) == easy(n) for n in nums)
True

Upvotes: 38

Alex Mykyta

Reputation: 151

Be careful! The accepted answer returns floor(log(n, 2)), NOT ceil(log(n, 2)) like the title of the question implies!

If you came here for a clog2 implementation, do this:

def clog2(x):
    """Ceiling of log2"""
    if x <= 0:
        raise ValueError("domain error")
    return (x-1).bit_length()

And for completeness:

def flog2(x):
    """Floor of log2"""
    if x <= 0:
        raise ValueError("domain error")
    return x.bit_length() - 1

Upvotes: 9

cefn

Reputation: 3321

I needed to calculate the upper bound power of two (to figure out how many bytes of entropy was needed to generate a random number in a given range using the modulus operator).

From a rough experiment I think the calculation below gives the minimum integer p such that val < 2^p

It's probably about as fast as you can get, and uses exclusively bitwise integer arithmetic.

def log2_approx(val):
    from math import floor
    val = floor(val)
    approx = 0
    while val != 0:
        val &= ~ (1<<approx)
        approx += 1
    return approx

Your slightly different value would be calculated for a given n by

log2_approx(n) - 1

...maybe. But in any case, the bitwise arithmetic could give you a clue how to do this fast.

Upvotes: 0

Bob Stein

Reputation: 17214

You specify in comments your x is an integer, but for anyone coming here where their x is already a float, then math.frexp() would be pretty fast at extracting log base 2:

log2_slow = int(floor(log(x, 2)))
log2_fast = frexp(x)[1]-1

The C function that frexp() calls just grabs and tweaks the exponent. Some more 'splainin:

The subscript[1] is because frexp() returns a tuple (significand, exponent).
The subtract-1 accounts for the significand being in the range [0.5,1.0). For example 2⁵⁰ is stored as 0.5x2⁵¹.
The floor() is because you specified 2^p <= x, so p == floor(log(x,2)).

(Derived from another answer.)

Upvotes: 6

Brian Cain

Reputation: 14619

With respect to "not accurate for large numbers" your challenge here is that the floating point representation is indeed not as precise as you need it to be (49.999999999993 != 50.0). A great reference is "What Every Computer Scientist Should Know About Floating-Point Arithmetic."

The good news is that the transformation of the C routine is very straightforward:

def getpos(value):
    if (value == 0):
        return -1
    pos = 0
    if (value & (value - 1)):
        pos = 1
    if (value & 0xFFFFFFFF00000000):
        pos += 32
        value = value >> 32
    if (value & 0x00000000FFFF0000):
        pos += 16
        value = value >> 16
    if (value & 0x000000000000FF00):
        pos += 8
        value = value >> 8
    if (value & 0x00000000000000F0):
        pos += 4
        value = value >> 4
    if (value & 0x000000000000000C):
        pos += 2
        value = value >> 2
    if (value & 0x0000000000000002):
        pos += 1
        value = value >> 1
    return pos

Another alternative is that you could round to the nearest integer, instead of truncating:

   log(x,2)
=> 49.999999999999993
   round(log(x,2),1)
=> 50.0

Upvotes: 2

Russell Borogove

Reputation: 19037

Works for me, Python 2.6.5 (CPython) on OSX 10.7:

>>> x = 2**50
>>> x
1125899906842624L
>>> p = int(log(x,2))
>>> p
50
>>> 2**p == x
True

It continues to work at least for exponents up to 1e9, by which time it starts to take quite a while to do the math. What are you actually getting for x and p in your test? What version of Python, on what OS, are you running?

Upvotes: 2

BrenBarn

Reputation: 251368

You could try the log2 function from numpy, which appears to work for powers up to 2^62:

>>> 2**np.log2(2**50) == 2**50
True
>>> 2**np.log2(2**62) == 2**62
True

Above that (at least for me) it fails due to the limtiations of numpy's internal number types, but that will handle data in the range you say you're dealing with.

Upvotes: 3

Compute fast log base 2 ceiling in python

Answers (7)

Related Questions