Thando
Thando

Reputation: 31

How do I make my python code for summing much faster?

The code below is for a sum of the defined function from the lower limit 2 and I vary the upper limit of the sum because I want to find out where the sum converges. I found that the bigger the upper limit the slower the code runs. How do I make this code run fast for any large upper limit value? The code is as follows:

def h(i):
    x = (-1)**(i+1)
    y = 1000000000-(i-1)
    z = (log(i))**20
    return x*y*z

 gx = sum(h(i) for i in range (2, 1000000000+1))
 d_gx = gx/1000000000
 print(d_gx)

Upvotes: 0

Views: 233

Answers (1)

erip
erip

Reputation: 16935

Numba is a python library for just-in-time optimization of pure python code without the need for external compilation steps.

The two important functions I'll introduce here are numba.njit and numba.vectorize which are both decorators. njit optimizes an arbitrary pure function and vectorize makes functions operate on both scalars and ndarrays.

In [1]: from numba import vectorize, njit; from math import log

In [2]: def h(i):
   ...:     x = (-1)**(i+1)
   ...:     y = 1000000000-(i-1)
   ...:     z = (log(i))**20
   ...:     return x*y*z
   ...:

In [3]: %timeit sum(h(i) for i in range (2, 1000000+1))
646 ms ± 9.16 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

As you can see here, the naive implementation of your function takes an average of 646ms on a reduced amount of the input space. We can improve this by jitting your function:

In [4]: jit_h = njit()(h)

In [5]: %timeit sum(jit_h(i) for i in range (2, 1000000+1))

179 ms ± 3.88 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
    

We've trimmed this down to 179ms, which is a huge improvement over the original 646ms. Because for-loops are slow, we can try to vectorize the operation using numpy arrays as inputs:

In [6]: vectorize_h = vectorize()(h)

In [7]: %timeit sum(vectorize_h(i) for i in range (2, 1000000+1))
657 ms ± 4.55 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

As expected, vectorizing the input permits passing scalars, but doesn't dramatically improve performance -- in fact, it's a bit slower! What if we operate on an entire numpy array?

In [8]: import numpy as np

In [9]: %timeit sum(vectorize_h(np.arange(2,1000000+1))) 

149 ms ± 1.78 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Finally, what if we replace the sum builtin with numpy ndarray sum?

In [10]: %timeit vectorize_h(np.arange(2,1000000+1)).sum() 
17.2 ms ± 207 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

A further reduction down to 17.2ms -- a huge improvement over the naive implementation.

Upvotes: 3

Related Questions