Reputation: 8059
This surprises me a bit. I've been testing performance.
In [1]: import numpy as np
In [2]: %timeit a = np.sum(range(100000))
Out[2]: 100 loops, best of 3: 16.7 ms per loop
In [3]: %timeit a = np.sum([range(100000)])
Out[3]: 100 loops, best of 3: 16.7 ms per loop
In [4]: %timeit a = np.sum([i for i in range(100000)])
Out[4]: 100 loops, best of 3: 12 ms per loop
In [5]: %timeit a = np.sum((i for i in range(100000)))
Out[5]: 100 loops, best of 3: 8.43 ms per loop
I'm trying to understand the inner workings and learn how to generalize them into a best practice. Why is 4 (building a new generator) better than 1?
I understand why creating a list takes more time. But then why is 3 better than 2? And why isn't 2 worse than 1? Is a list being built in 1?
I'm using from numpy import *.
Upvotes: 0
Views: 157
Reputation: 8557
Running the same code, I get these results (Python 3.5.1):
%timeit a = sum(range(100000))
100 loops, best of 3: 3.05 ms per loop
%timeit a = sum([range(100000)])
>>> TypeError: unsupported operand type(s) for +: 'int' and 'range'
%timeit a = sum([i for i in range(100000)])
100 loops, best of 3: 8.12 ms per loop
%timeit a = sum((i for i in range(100000)))
100 loops, best of 3: 8.97 ms per loop
Now with numpy's sum() implementation:
from numpy import sum
%timeit a = sum(range(100000))
10 loops, best of 3: 19.7 ms per loop
%timeit a = sum([range(100000)])
10 loops, best of 3: 20.2 ms per loop
%timeit a = sum([i for i in range(100000)])
100 loops, best of 3: 16.2 ms per loop
%timeit a = sum((i for i in range(100000)))
100 loops, best of 3: 9.27 ms per loop
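The difference between the TypeError from the built-in and the successful run with numpy can be reproduced on a small input. This is a minimal sketch, assuming numpy is installed; the names wrapped, as_array and numpy_total are illustrative only and do not come from the timings above.

```python
import builtins
import numpy as np

wrapped = [range(5)]  # a list holding a single range object

# The built-in sum starts from 0 and tries 0 + range(0, 5),
# which is not defined, so it raises TypeError.
try:
    builtins.sum(wrapped)
    builtin_error = None
except TypeError as exc:
    builtin_error = exc

# numpy converts its argument to an array first; the nested range
# becomes one row of a 2-D array, and all elements are then summed.
as_array = np.asarray(wrapped)
numpy_total = np.sum(wrapped)

print(builtin_error)
print(as_array.shape, numpy_total)
```

So with numpy's sum, wrapping the range in a list mostly adds a dimension rather than changing what gets summed, which would be consistent with cases 1 and 2 timing about the same.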
What's happened is that by using from numpy import * (or from numpy import sum) you're clobbering Python's built-in sum() function.
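A small sketch of the shadowing, assuming numpy is installed. The assignment below stands in for from numpy import sum so that the effect is visible in a single script; the variable names are made up for illustration.

```python
import builtins
import numpy as np

# Equivalent in effect to `from numpy import sum` at module level:
# the name `sum` now refers to numpy's function, not the built-in.
sum = np.sum

data = range(1000)
shadowed_result = sum(data)           # dispatches to numpy.sum
builtin_result = builtins.sum(data)   # the original is still reachable here

print(sum is np.sum)
print(type(shadowed_result), type(builtin_result))
```

Using import numpy as np and calling np.sum explicitly keeps both functions available under distinct names.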
Have a look at this SO question, which discusses a performance comparison between the two implementations.
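To compare the two implementations without relying on which sum is currently in scope, something like the following can be used. It is only a sketch, assuming numpy is installed; the input size, repeat count and labels are arbitrary choices, and the numbers will differ by machine and version.

```python
import builtins
import timeit
import numpy as np

n = 100_000
repeats = 20
as_list = list(range(n))
as_array = np.arange(n)

# Each entry is the total time in seconds for `repeats` calls.
# The "numpy sum, list" case includes converting the list to an array
# on every call; the ndarray case does not.
timings = {
    "builtin sum, list": timeit.timeit(lambda: builtins.sum(as_list), number=repeats),
    "numpy sum, list": timeit.timeit(lambda: np.sum(as_list), number=repeats),
    "numpy sum, ndarray": timeit.timeit(lambda: np.sum(as_array), number=repeats),
}

for label, seconds in timings.items():
    print(f"{label}: {seconds / repeats * 1e3:.3f} ms per call")
```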
Upvotes: 2