Reputation: 7990
Is there some way to avoid the for
loop in this code:
X = numpy array ~8000 long
running_total = 0
for x in X:
this_step = min(x, running_total)
running_total += this_step
In words, that's calculating a cumulative sum of a series where the difference between samples is limited to the previous value of the cumulative sum. Is there some way to do this using numpy's element-wise operations? Or should I be looking to write a C function if I care how fast this runs?
Edit Obviously I've not made myself clear. My attempt to simplify the code above seems to have caused more confusion than anything else. Here's something closer to the actual code:
monthly_income = numpy array ~ 8000 long
monthly_expenditure = numpy array ~ 8000 long
running_credit = numpy.zeros(len(monthly_income) + 1)
monthly_borrowing = numpy.zeros(len(monthly_income))
for index, i, e in zip(range(len(monthly_income)), monthly_income, monthly_expenditure):
assets_used = max(0, e - i)
assets_used = min(assets_used, running_credit[index])
monthly_borrowing[index] = max(0, e - i - running_credit[index])
running_credit[index+1] += max(0, i - e) - assets_used
The point is that running_index[index+1]
depends on assets_used
at sample index
, which depends on running_credit[index]
. Doing this in a Python loop is slow - in a function which does many similar calculations on the same input arrays using NumPy operations, the above loop takes over 80% of the execution time. But I can't see a way of doing the above operation without a for loop.
So is there some way of doing this kind of iterative operation in NumPy without a for loop? Or do I need to write a C function if I want this to run fast?
Upvotes: 4
Views: 1020
Reputation: 1690
Easiest quick-fix for this kind of problem i find is to use numba. E.g.
from numba import jit
import numpy as np
def cumsumcapped(X):
running_total = 0
for x in X:
this_step = min(x, running_total)
running_total += this_step
@jit
def cumsumcappedjit(X):
running_total = 0
for x in X:
this_step = min(x, running_total)
running_total += this_step
X = np.random.randint(1, 100, 10000)
In [5]: %timeit cumsumcapped(X)
100 loops, best of 3: 4.1 ms per loop
In [6]: %timeit stack.cumsumcappedjit(X)
The slowest run took 170143.03 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 587 ns per loop
Upvotes: 1
Reputation: 4076
Not sure but guessing from your explanation (assuming x is nonnegative)
X = [1 ... 999]
running_total = X[0]
for x in X[1:]:
this_step = min(x, running_total)
running_total += this_step
Upvotes: 0