Breaking early when computing cumulative products or sums in numpy

Question

Say I have a range r=numpy.array(range(1, 6)) and I am calculating the cumulative sum using numpy.cumsum(r). But instead of returning [1, 3, 6, 10, 15] I would like it to return [1, 3, 6] because of the condition that the cumulative result must be less than 10.

If the array is very large, I would like the cumulative sum to break out before it starts calculating values that are redundant and will be thrown away later. Of course, I am trivializing everything here for the sake of the question.

Is it possible to break out of cumsum or cumprod early based on a condition?

Bas Swinckels · Accepted Answer

I don't think this is possible with any function in numpy, since in most cases these are meant for vectorized computations on fixed-length arrays. One obvious way to do what you want is to break out of a standard for-loop in Python (as I assume you know):

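def limited_cumsum(x, limit):
    """Return the cumulative sums of x that stay below limit."""
    y = []
    sm = 0
    for item in x:
        sm += item
        # Stop at the first running total that reaches the limit;
        # the question asks for results strictly less than it.
        if sm >= limit:
            return y
        y.append(sm)
    return y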

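But unless the limit is reached after only a few elements, this will be much slower than numpy's cumsum, likely by an order of magnitude or more, because every addition goes through the Python interpreter.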
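One possible middle ground, sketched here as an assumption rather than anything numpy provides out of the box, is to run numpy's cumsum on fixed-size chunks and stop at the first chunk in which the running total reaches the limit. The function name chunked_limited_cumsum and the chunk_size parameter are made up for this illustration, and the sketch keeps only sums strictly below the limit, as in the question.

```python
import numpy as np

def chunked_limited_cumsum(x, limit, chunk_size=1024):
    """Cumulative sums of x, cut off before the first sum that reaches limit."""
    x = np.asarray(x)
    pieces = []
    offset = 0  # running total carried over from the chunks already processed
    for start in range(0, len(x), chunk_size):
        # Vectorized cumsum over one chunk, shifted by the total so far.
        sums = offset + np.cumsum(x[start:start + chunk_size])
        # Positions in this chunk where the running total reaches the limit.
        over = np.flatnonzero(sums >= limit)
        if over.size:
            pieces.append(sums[:over[0]])
            break  # later chunks are never touched
        pieces.append(sums)
        offset = sums[-1]
    return np.concatenate(pieces) if pieces else x[:0]

print(chunked_limited_cumsum(np.arange(1, 6), 10))
```

At most one chunk's worth of values is computed past the cutoff, so chunk_size trades wasted work against the number of Python-level iterations.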

Since you probably need some very specialized function, the chances of finding the exact function you need in numpy are low. You should probably have a look at Cython, which allows you to implement custom functions that are as flexible as a Python function (and using a syntax that is almost Python), with a speed close to that of C.
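If adding a compiled extension is not an option, the same early exit can also be written lazily with the standard library's itertools. This is only a sketch: it still runs at Python speed, and the name lazy_limited_accumulate and its func parameter are invented for this example. Passing operator.mul gives the cumprod variant mentioned in the question.

```python
import operator
from itertools import accumulate, takewhile

def lazy_limited_accumulate(x, limit, func=operator.add):
    """Running results of func over x, stopping before the first one >= limit."""
    # accumulate yields one running value at a time; takewhile stops
    # consuming it as soon as a value fails the test, so the rest of x
    # is never processed.
    return list(takewhile(lambda s: s < limit, accumulate(x, func)))

print(lazy_limited_accumulate(range(1, 6), 10))                # [1, 3, 6]
print(lazy_limited_accumulate(range(1, 6), 10, operator.mul))  # [1, 2, 6]
```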
