Reputation: 822
What is the performance overhead of different methods of creating and filling lists in Python?
Upvotes: 1
Views: 798
Reputation: 822
The following is intended to measure the overhead of different methods of creating and filling lists in Python. In a real program you would of course do something more meaningful with the data added to the lists.
To test this, I made a bunch of files called test1.py
, test2.py
, etc. that create and fill a list in different ways. I then ran them with the timeit
module:
python -m timeit "from test1 import foo; foo()"
This was tested with Python 3.6.0 on a laptop PC with a 2.4 GHz CPU.
The results are shown best to worst:
def foo():
a = list(range(10000000))
A question was asked in the comments about the performance of this. Note that this is only useful for filling a list with sequential numbers.
def foo():
a = [i for i in range(10000000)]
def foo():
k = 10000000
a = [0] * k
for i in range(k):
a[i] = i
This pre-allocates the entire list so the for-loop merely fills it without calling list-append. Although the append-function has amortized constant computational time, it is not completely free, because the list has to be grown periodically as it becomes full, which requires allocating new memory and copying the contents of the list. Pre-allocating the list avoids the expense of growing the list.
def foo():
a = list((i for i in range(10000000)))
def foo():
def bar():
for i in range(10000000):
yield i
a = list(bar())
There is a degree of uncertainty in these time measurements and the two generators seemed to use roughly the same amount of time.
def foo():
a = []
for i in range(10000000):
a.append(i)
Upvotes: 4