Reputation: 77
I am using Python 3.8 and NumPy 1.17.4. The output of the following piece of code
import time
import sys
import numpy as np
if __name__ == '__main__':
    li = np.zeros(5000000, dtype=int)
    sys.stdout.write("%s %s\n" % (type(li), type(li[0])))
    start = time.process_time()
    li += 5
    sys.stdout.write("%.6fs\n" % (time.process_time() - start))

    li = np.zeros(5000000, dtype=int)
    li = list(li)
    li = np.array(li)
    sys.stdout.write("%s %s\n" % (type(li), type(li[0])))
    start = time.process_time()
    li += 5
    sys.stdout.write("%.6fs\n" % (time.process_time() - start))
looks like
<class 'numpy.ndarray'> <class 'numpy.int64'>
0.037046s
<class 'numpy.ndarray'> <class 'numpy.int64'>
0.003537s
How come the latter is roughly 10x faster to increment?
Upvotes: 2
Views: 106
Reputation: 389
It seems to me that the explanation is the same as here. Indeed, numpy.zeros() appears to allocate its memory lazily: the OS hands back zeroed pages that are only actually mapped in on the first write, so the first in-place operation on a fresh array also pays the page-fault cost. The list/array round trip in your second version forces that allocation to happen before you start timing. I modified your sample to warm the pages with a dummy operation first:
import time
import sys
import numpy as np
if __name__ == '__main__':
    li = np.zeros(5000000, dtype=int)
    sys.stdout.write("%s %s\n" % (type(li), type(li[0])))
    start = time.process_time()
    li += 5
    sys.stdout.write("%.6fs\n" % (time.process_time() - start))

    li = np.zeros(5000000, dtype=int)
    li += 0  # dummy operation: touches every page before the timed write
    sys.stdout.write("%s %s\n" % (type(li), type(li[0])))
    start = time.process_time()
    li += 5
    sys.stdout.write("%.6fs\n" % (time.process_time() - start))
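A minimal way to check this yourself (timings are illustrative and will vary by machine) is to write to the same freshly zeroed array twice: the first in-place write faults the pages in, while the second is pure arithmetic on already-resident memory.

```python
import time
import numpy as np

li = np.zeros(5000000, dtype=int)

# First in-place write: the pages backing the array are faulted in here.
start = time.process_time()
li += 5
first = time.process_time() - start

# Second in-place write: pages are already resident, so this is arithmetic only.
start = time.process_time()
li += 5
second = time.process_time() - start

print("first write:  %.6fs" % first)
print("second write: %.6fs" % second)
```

On systems where zeros() relies on zero-filled pages from the OS, the first timing should be noticeably larger than the second.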
Upvotes: 1
Reputation: 1439
Your time estimate may not be accurate: it is based on a single run of the operation, which can be influenced by other factors. Using %timeit
shows fairly consistent results, even though the second method has a much higher standard deviation.
First method:
li = np.zeros(5000000,dtype=int)
%timeit zi = li + 5
12 ms ± 179 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Second method:
li = np.zeros(5000000,dtype=int)
li = list(li)
li = np.array(li)
%timeit zi = li + 5
13.3 ms ± 1.05 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
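Outside IPython, the same comparison can be sketched with the standard-library timeit module (the exact numbers below the setup are assumptions about nothing; timings will vary by machine):

```python
import timeit

setup_fresh = "import numpy as np; li = np.zeros(5000000, dtype=int)"
setup_roundtrip = ("import numpy as np; "
                   "li = np.array(list(np.zeros(5000000, dtype=int)))")

# Time the out-of-place addition for both setups; taking the best of
# several repeats reduces the influence of other processes.
t_fresh = min(timeit.repeat("zi = li + 5", setup=setup_fresh,
                            number=10, repeat=3)) / 10
t_roundtrip = min(timeit.repeat("zi = li + 5", setup=setup_roundtrip,
                                number=10, repeat=3)) / 10

print("fresh zeros:         %.6fs per loop" % t_fresh)
print("round-tripped array: %.6fs per loop" % t_roundtrip)
```

Because the setup code runs before the timed statement, the page-fault cost of the fresh array is excluded here too, which is why the two per-loop times come out close.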
Upvotes: 0