Reputation: 6612
I'm trying to understand the performance of list comprehensions in Python, and the trade-offs of using them versus for loops to create lists. One of the known performance costs of using a for loop to append elements to a list is that at each iteration it is O(k) (where k is the length of the list) because the append needs to get to the end of the list to add an additional element.
How does this work for list comprehensions? At each iteration does it need to get to the end of the new list to append a new element?
# For loop:
# O(n*k) (k=number of elements currently in list) time complexity:
new_list = []
for i in range(n): # O(n)
new_list.append(i) # O(k)
# List comprehension:
new_list = [i for i in range(n)] # Is this O(n)?
I've searched the Python documentation, Stack Overflow, and other websites and couldn't find any information about this. There are many resources for more higher level information about list comprehensions but nothing specific like this.
If you can't provide an answer can you please direct me to, or show me how to look at the actual underlying Python list comprehension code so I can do this on my own?
Upvotes: 0
Views: 1119
Reputation: 160447
Appending to a list is amortized O(1)
and not O(k)
; lists are implemented as variable-length arrays, not as linked-lists. The complexity applies both for for
loops with my_list.append
calls and list-comprehensions (which, spoiler alert, also append).
So in both cases. The complexity is O(N)
.
List-comprehensions generally perform better because they are specialized to do one thing: create lists. The byte-code produced for them is specific to that. (See the LIST_APPEND
bytecode)
Also note that list-comprehensions, like for-loops, don't necessarily append at each iteration. Using if
clauses to filter out elements of the iterable you're looping through is commonly used.
If you'd like to see how list-comprehensions are implemented in CPython, you can take a look at the bytecode produced for them and scan through ceval.c
for the actions performed for each.
The byte-code can be seen with dis
after we compile a list-comprehension expression:
dis(compile('[i for i in range(10)]', '', 'exec').co_consts[0])
1 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 8 (to 14)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LIST_APPEND 2
12 JUMP_ABSOLUTE 4
>> 14 RETURN_VALUE
Then, scan through the cases in ceval.c
or look at their documentation in the dis
module.
Upvotes: 5