Reputation: 7649
I have seen similar questions, but none that addresses this issue directly. I have timed the following two ways of populating an array: about half the time using np.zeros() is faster, and half the time creating the array directly is faster. Is one way preferable? I am quite new to numpy arrays and started using them to speed up my code, without giving much thought to readability.
import numpy as np
import time
lis = range(100000)
timer = time.time()
list1 = np.array(lis)
print 'normal array creation', time.time() - timer, 'seconds'
timer = time.time()
list2 = np.zeros(len(lis))
list2.fill(lis)
print 'zero, fill - array creation', time.time() - timer, 'seconds'
Thank you
Upvotes: 6
Views: 4764
Reputation: 2108
np.fromiter will pre-allocate the output array if given the number of elements:
a = [x/10. for x in range(100000)] # 10.3ms
np.fromiter(a, dtype=float) # 3.33ms
np.fromiter(a, dtype=float, count=100000) # 3.03ms
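As a minimal sketch of the pre-allocation point, assuming a current NumPy where the builtin float is used as the dtype: passing count lets np.fromiter size the output once, which matters most when the input is a generator with no known length. The names n, values and arr are illustrative only.

```python
import numpy as np

n = 100000
# A generator has no len(), so fromiter cannot know the size up front.
values = (x / 10.0 for x in range(n))

# count tells fromiter how many items to read, so it can allocate once.
arr = np.fromiter(values, dtype=float, count=n)

print(arr.shape, arr.dtype)
```

Timings will vary by machine and NumPy version, so it is worth measuring with timeit on your own data before relying on the difference.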
Upvotes: 2
Reputation: 5599
The first list can be created faster with the numpy function arange:
list3 = np.arange(100000)
You may also find the linspace function useful.
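A short sketch of both functions, assuming the goal is an evenly spaced numeric range; the variable names are illustrative only:

```python
import numpy as np

# Integers 0, 1, ..., 99999, built without an intermediate Python list.
ints = np.arange(100000)

# Dividing the whole array at once gives 0.0, 0.1, 0.2, ...
floats = np.arange(100000) / 10.0

# linspace takes the number of points rather than a step;
# both endpoints are included by default.
grid = np.linspace(0.0, 1.0, 11)
```

arange is convenient when the step is known, while linspace is usually the safer choice when the number of points and both endpoints are fixed.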
Upvotes: 1
Reputation: 212825
If you have a list of floats a=[x/10. for x in range(100000)], then you can create an array with:
np.array(a) # 9.92ms
np.fromiter(a, dtype=float) # 5.19ms
Your approach
list2 = np.zeros(len(lis))
list2.fill(lis)
won't work as expected: .fill sets every element of the array to a single scalar value.
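To illustrate the difference, here is a small sketch. Slice assignment is shown as one possible way to copy a list into a pre-allocated array; it is a suggested alternative, not something taken from the question.

```python
import numpy as np

lis = list(range(100000))

# .fill takes one scalar and writes it to every position.
filled = np.zeros(5)
filled.fill(7)

# To copy a sequence into a pre-allocated array, assign to a slice instead.
list2 = np.zeros(len(lis))
list2[:] = lis

# Same values as converting the list directly.
assert np.array_equal(list2, np.array(lis))
```

Since np.array(lis) allocates and copies in one step, the zeros-then-assign form mainly makes sense when the array is reused or filled in pieces.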
Upvotes: 6
Reputation: 34314
Your list2 example simply doesn't work: if you inspect list2, you'll find that it still contains all zeroes. I find that pursuing readability is not just a good aim in and of itself; it also makes correct code more likely.
Upvotes: 1