Reputation: 89
Suppose I have a list:
lst = [0, 1, 2, 3, 4, 5, 6...]
I want to iterate over this list in an expanding manner so that every iteration inside a loop I get:
1 iteration: [0]
2 iteration: [0, 1]
3 iteration: [0, 1, 2]
4 iteration: [0, 1, 2, 3]
...
This is basically to emulate a real time growing time series.
I pass this slice of lst to another function to make some computations.
Now I do it like this:
for i in range(1, len(lst)):
my_func(lst[:i], other_variables)
Boy this is slow. My list has above 1M elements. I know that slicing makes a shallow copy of list, maybe I can somehow avoid copies, or use some kind of different approach.
I also tried this:
lst2 = []
for i in range(len(lst)):
lst2.append(lst[i])
my_func(lst2, other_variables)
A little bit better but still pretty slow.
I also tried deque() instead of list.
Upvotes: 0
Views: 860
Reputation: 19811
On top of what has been mentioned in this answer, we can try to avoid dots to improve the performance.
lst2 = []
append = lst2.append
for c in lst:
append(c)
my_func(lst2, other_variables)
We can see the performance gain with a mocked my_func
:
def my_func(kt):
pass
def method1():
lst2 = []
for c in lst:
lst2.append(c)
my_func(lst2)
def method2():
lst2 = []
append = lst2.append
for c in lst:
append(c)
my_func(lst2)
In [11]: %timeit method1()
84.8 ms ± 4.73 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [12]: %timeit method2()
67 ms ± 262 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Upvotes: 1
Reputation: 531035
As i
gets bigger, lst[:i]
gets more and more expensive. But you don't need to create a new slice at each iteration; just append the next value of lst
to a pre-existing prefix.
pfx = []
for c in lst:
pfx.append(c)
my_func(pfx, other_variables)
You could use itertools.accumualte
, except Python doesn't have a nice expression for appending a value to a list and returning the updated list. pfx + [c]
is just as slow lst[:i]
.
Upvotes: 1