Sandus
Sandus

Reputation: 23

A groupby cumsum that is set to zero and restarts the cumsum when the value is below zero

I'm trying to create a column with cumulative sum that resets to zero when the cumsum gets below zero. I the following data:

id treatment value
1 drugs 66
1 drugs 33
1 drugs -100
1 drugs 11
1 drugs 30
1 drugs -50

The desired result:

id treatment days cumsum
1 drugs 66 66
1 drugs 33 99
1 drugs -100 0
1 drugs 11 11
1 drugs 30 41
1 drugs -50 0

Is there a solution close to this attempt?

df.groupby(['id','treatment']).days.apply(lambda x: 0 if x.cumsum() < 0 else x.cumsum())

Upvotes: 2

Views: 324

Answers (2)

BENY
BENY

Reputation: 323386

I will recommend use this numba function this time:

from numba import njit

@njit
def cumli(x, lim):
    total = 0
    result = []
    for i, y in enumerate(x):
        total += y
        if total < lim:
            total = 0
        result.append(total)
    return result

df['cumsum'] = df.groupby(['id','treatment']).days.transform(lambda x: cumli(x.values,0))

Upvotes: 1

user17242583
user17242583

Reputation:

Building off of @BENY's excellent answer here, you can do this:

df['cumsum'] = df.groupby(df['value'].lt(0).astype(int).diff().ne(0).cumsum())['value'].cumsum().clip(lower=0)

Output:

>>> df
   id treatment  value  cumsum
0   1     drugs     66      66
1   1     drugs     33      99
2   1     drugs   -100       0
3   1     drugs     11      11
4   1     drugs     30      41
5   1     drugs    -50       0

Upvotes: 2

Related Questions