Daffy
Daffy

Reputation: 851

Impossible speedups when using JIT in python. What's going on?

I'm trying to test the speed improvement of using JIT in python. Here's the code I'm using.

from numba import jit
import timeit

@jit # Commented out when testing pure python
def sumof():
    x = 0
    for i in xrange(1000000000):
        x += 1
    return x

def timer():
    sumof() # Run once to initialize the JIT compiler
    l = []
    t = timeit.default_timer()
    for x in range(10):
        l.append(sumof())
    return timeit.default_timer()-t, l # Returns the time elapsed and the list of results, to verify accuracy

print timer()

This gives a result similar to this

(5.643910299113486e-06, [1000000000, 1000000000, 1000000000, 1000000000, 1000000000, 1000000000, 1000000000, 1000000000, 1000000000, 1000000000])

Now I know raw CPU performance is orders of magnitude faster than pure python, but 10 billion operations in 5 microseconds?? I tested this same code, but using the maximum value of a signed 64 bit integer instead of just a billion. This was the result.

(5.643909389618784e-06, [9223372036854775807L, 9223372036854775807L, 9223372036854775807L, 9223372036854775807L, 9223372036854775807L, 9223372036854775807L, 9223372036854775807L, 9223372036854775807L, 9223372036854775807L, 9223372036854775807L])

~ 92 quadrillion operations... in 5 microseconds. By my math, that's about 16 septillion operations a second. Something is obviously wrong, yet it's giving the correct result. I suspect the compiler is getting rid of the for loop somehow, but why? And how can I prevent it?

Upvotes: 0

Views: 804

Answers (1)

torek
torek

Reputation: 488453

It seems pretty likely that the optimizer has replaced the loop with a single constant calculation. See https://en.wikipedia.org/wiki/Loop_optimization for a list of classic loop optimizations. In this case, completely unrolling the loop, then combining all the constants, results in return n (with x += 1) or return n * b (with x += b). Using x += i results in return n * (n + 1) / 2. (In each case n is the appropriate upper loop bound: when summing i in range(n), it's really just n-1 instead.)

Because this is a JIT compiler, it can do this even for variable n, although in your examples each n is a constant, so even non-JIT compilers can do this.

Upvotes: 1

Related Questions