Performance of list creation in Python

Question

What is the performance overhead of different methods of creating and filling lists in Python?

questiondude · Accepted Answer

The following is intended to measure the overhead of different methods of creating and filling lists in Python. In a real program you would of course do something more meaningful with the data added to the lists.

To test this, I made a bunch of files called test1.py, test2.py, etc. that create and fill a list in different ways. I then ran them with the timeit module:

python -m timeit "from test1 import foo; foo()"

This was tested with Python 3.6.0 on a laptop PC with a 2.4 GHz CPU.

The results are shown best to worst:

Range (237 msec per loop)

def foo():
    a = list(range(10000000))

A question was asked in the comments about the performance of this. Note that this is only useful for filling a list with sequential numbers.

List comprehension (380 msec per loop)

def foo():
    a = [i for i in range(10000000)]

Pre-allocated list (492 msec per loop)

def foo():
    k = 10000000
    a = [0] * k
    for i in range(k):
        a[i] = i

This pre-allocates the entire list so the for-loop merely fills it without calling list-append. Although the append-function has amortized constant computational time, it is not completely free, because the list has to be grown periodically as it becomes full, which requires allocating new memory and copying the contents of the list. Pre-allocating the list avoids the expense of growing the list.

Generator comprehension (573 msec per loop)

def foo():
    a = list((i for i in range(10000000)))

Generator with yield-function (580 msec per loop)

def foo():
    def bar():
        for i in range(10000000):
            yield i

    a = list(bar())

There is a degree of uncertainty in these time measurements and the two generators seemed to use roughly the same amount of time.

List append (827 msec per loop)

def foo():
    a = []
    for i in range(10000000):
        a.append(i)