Reputation: 25
I was trying to create an example where preallocating a numpy array is quicker than using .append on a list, but ended up proving the opposite.
import numpy as np
import time
t = time.time()
emptyList = []
for i in range(int(1e8)):
emptyList.append(1)
print("Append: ", time.time() - t)
t = time.time()
emptyList = np.empty(int(1e8))
for i in range(int(1e8)):
emptyList[i] = 1
print("Preallocate: ", time.time() - t)
Which gives the following output:
Append: 12.733097314834595
Preallocate: 19.008652925491333
I'm very confused as to why appending to a list is quicker in this case. I've also tried using random numbers instead of '1' everywhere, but got the same result.
Upvotes: 1
Views: 1639
Reputation: 2279
It is not possible to edit @Prune example, I offer this snippet of code in the attempt to propose an inclusive and complete example of her point:
import time
nb_rep = 1e7
t = time.time()
emptyList = []
for i in range(int(nb_rep)):
emptyList.append(1)
print("Append: ", time.time() - t)
t = time.time()
emptyList = [1] * int(nb_rep)
for i in range(int(nb_rep)):
emptyList[i] = 1
print("Preallocate: ", time.time() - t)
t = time.time()
emptyList = np.empty(int(nb_rep))
for i in range(int(nb_rep)):
emptyList[i] = 1
print("Preallocate Numpy: ", time.time() - t)
the result locally:
Append: 0.6769435405731201
Preallocate: 0.5485126972198486
Preallocate Numpy: 0.9414911270141602
Upvotes: 0
Reputation: 77867
Please note that you're not using the same class in both examples.
You're comparing append
for the built-in Python type list
against assignment to -- not a list, but to allocate and assign to a NumPy array.
I would expect them to be much closer, but that's not the issue: you have two items here, allocation and assignment.
When I run your code as given, I get
Append: 16.296499013900757
Preallocate: 16.835283517837524
When I change the second loop to eliminate the NumPy overhead:
emptyList = [0] * (int(1e8))
Result:
Append: 23.76796054840088
Preallocate: 11.050438404083252
Upvotes: 4
Reputation: 13672
It's because python lists aren't actually linked lists; they are implemented underneath as arrays: https://docs.python.org/3/faq/design.html#how-are-lists-implemented-in-cpython
When items are appended or inserted, the array of references is resized. Some cleverness is applied to improve the performance of appending items repeatedly; when the array must be grown, some extra space is allocated so the next few times don’t require an actual resize.
They are being smarter than simply increasing each time a single append is requested, similar to how TCP works with a multiplicative increase.
Other reading: https://www.geeksforgeeks.org/internal-working-of-list-in-python/
How is Python's List Implemented?
Upvotes: 4