equanimity

Reputation: 2533

Cumulative sum of list up to a threshold value without using np.cumsum() or itertools accumulate

I'm trying to write a function that returns the difference:

a) between a user-defined threshold value and the cumulative sum of a list of integers that exceeds the threshold value, OR,

b) between the threshold value and the sum of the list's elements (assuming we have iterated over the entire list and the sum of the elements does not exceed the threshold value).

Here are two examples:

elements = [10, 20, 15, 25, 50]
cumulative_sum = [10, 30, 45, 70, 120]
threshold = 35  
difference = 10

elements = [10, 20, 15, 25, 50]
cumulative_sum = [10, 30, 45, 70, 120]  # 120 does not exceed the threshold value
threshold = 145
difference = 25

The following only works for case B above. It does not work for case A:

temp_list = []
sum = 0
for elem in numbers:
    sum += elem
    temp_list.append(sum)
if sum < threshold:
    pass
    # get the last value in the cumulative sum list that exceeds the threshold value

if sum > threshold:
    print(temp_list[-1] - sum)

I know that we can use np.cumsum() or itertools.accumulate to handle case A. But how would we do this without using NumPy?

Thanks!

##############

EDIT

Inspired by @Mark's comment, I solved this as follows:

def difference(elements, threshold):
    cumsum = 0
    for elem in elements:
        cumsum += elem
        if cumsum >= threshold:
            break
    return abs(cumsum - threshold)

This is essentially the same as the solution provided by @j1-lee (except for the abs function).
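
As a quick sanity check, here is a minimal sketch that runs the function above against the two examples from the question. The function is repeated only so the snippet is self-contained; the third call is an extra case I added (not from the original examples) to show what `>=` does when the cumulative sum hits the threshold exactly.

```python
def difference(elements, threshold):
    cumsum = 0
    for elem in elements:
        cumsum += elem
        # Stop as soon as the running total reaches or passes the threshold.
        if cumsum >= threshold:
            break
    return abs(cumsum - threshold)


elements = [10, 20, 15, 25, 50]
print(difference(elements, 35))       # 10 (45 is the first cumulative sum >= 35)
print(difference(elements, 145))      # 25 (the total, 120, never reaches 145)
print(difference([10, 20, 15], 30))   # 0  (stops at 30, which equals the threshold)
```

Note that with a strict `>` comparison the last call would keep going to 45 and return 15, so the choice between `>=` and `>` matters when a cumulative sum can equal the threshold.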

Upvotes: 0

Views: 585

Answers (2)

rici

Reputation: 241861

Personally, I'd use itertools.accumulate, which is part of the standard library (which means it's there whether you choose to use it or not). But if you insist on only using "built-in" functions which are not in any module, the following will work, if you have Python 3.8 or more recent (for the "walrus" operator):

def delta(thresh, data):
    return -thresh if any((thresh := thresh - x) < 0 for x in data) else thresh

The use of any causes the implicit loop to terminate as soon as the accumulated threshold goes negative, so if the data argument is a generator (or another lazy iterable, such as a range), it will only be consumed as far as necessary:

>>> # This returns instantly
>>> delta(17, range(1000000000))
4
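
For comparison, the itertools.accumulate approach mentioned at the top of this answer might look roughly like the sketch below. The name `delta_accumulate` is made up for illustration, and the argument order simply mirrors `delta` above. Since `accumulate` is lazy, this version should also stop early on a large range.

```python
from itertools import accumulate


def delta_accumulate(thresh, data):
    # Hypothetical helper name; same argument order as delta(thresh, data).
    total = 0  # stays 0 if data is empty
    for total in accumulate(data):
        # accumulate yields running totals lazily, so we can bail out early.
        if total > thresh:
            return total - thresh
    # The list was exhausted without exceeding the threshold.
    return thresh - total


print(delta_accumulate(17, range(1000000000)))     # 4
print(delta_accumulate(35, [10, 20, 15, 25, 50]))  # 10
print(delta_accumulate(145, [10, 20, 15, 25, 50])) # 25
```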

Upvotes: 1

j1-lee

Reputation: 13939

You can use a for loop:

def diff_threshold(lst, threshold):
    cumsum = 0
    for x in lst:
        cumsum += x
        if cumsum > threshold:
            return cumsum - threshold
    return threshold - cumsum

print(diff_threshold([10, 20, 15, 25, 50], 35)) # 10
print(diff_threshold([10, 20, 15, 25, 50], 145)) # 25

Upvotes: 1
