Reputation: 31
Execute these lines:
import numpy as np
import time
start = time.time()
t = []
for i in range(int(1e5)):
    t.append(i)
t = np.array(t)
end = time.time()
print(end-start)
And compare with these:
import numpy as np
import time
start = time.time()
t = np.array([])
for i in range(int(1e5)):
    t = np.append(t, [i])
end = time.time()
print(end-start)
The first is faster than the second by approximately a factor of 100!
What is the reason?
Upvotes: 2
Views: 3787
Reputation: 1330
According to this article, the author appends 99,999 numbers using both the Python list append() and NumPy append(). Results:
The computation time of the NumPy array: 2.779465675354004
The computation time of the list: 0.010703325271606445
With NumPy:
The append does not happen in the existing array. Instead, np.append() creates a new array on every call and copies the old contents plus the new values into it.
With Python:
Things are very different. list.append() adds the element to the existing list object in place, and no new list is created.
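A minimal sketch of that difference, assuming only standard NumPy calls (the variable names are illustrative): np.append() hands back a separate array and leaves its input untouched, while list.append() mutates the same list object.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.append(a, [4.0])

# np.append returned a new array; the original still has 3 elements.
print(a.size, b.size)
# The two arrays do not share any memory: the data was copied.
print(np.shares_memory(a, b))

lst = [1, 2, 3]
same = lst          # second name for the same list object
lst.append(4)       # grows the list in place
# Both names still refer to one object, which now has 4 elements.
print(same is lst, len(same))
```

This is why the loop in the question has to write t = np.append(t, [i]): the result must be rebound, because t itself is never modified.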
Upvotes: 4
Reputation: 77337
Python lists hold references to objects. These references are stored contiguously in memory, but Python over-allocates its reference array in chunks, so only some appends require a reallocation and copy. NumPy does not preallocate extra space, so np.append() copies the whole array every time, and the total work grows roughly quadratically with the number of appends. And since all of the columns need to maintain the same length, they are all copied on each append.
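The chunked allocation can be observed, at least on CPython, by watching sys.getsizeof() while appending. This is a rough sketch that relies on a CPython implementation detail (the reported size changes only when the internal reference array is resized), so the exact count may differ between versions.

```python
import sys

n = 1000
lst = []
resizes = 0
last_size = sys.getsizeof(lst)

for i in range(n):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last_size:
        # The reported size changed, so the reference array was resized.
        resizes += 1
        last_size = size

# Far fewer resizes than appends: most appends reuse spare capacity.
print(resizes, "resizes for", n, "appends")
```

With np.append() the equivalent count would be one new allocation and full copy per call.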
Upvotes: 4