Reputation: 15
I'm trying to calculate a running total that reset to 0 when it reaches a certain threshold. I'm pretty new to python and gave it a shot. I feel like i'm on the right track, but could use a little help. Any pointers / solutions are greatly appreciated.
Sample Data
Date Team Amount Threshold
1/9/2020 Team1 364970 5000000
7/18/2020 Team2 762532 5000000
6/15/2020 Team1 497686 5000000
3/14/2020 Team2 762980 5000000
2/14/2020 Team2 793256 5000000
my attempt:
df = sheets.get_cells(sheet, ws, has_col_header = True)
df["Amount"] = df["Amount"].astype(int)
df['Running Total'] = df.groupby('Team').cumsum().astype(int)
df["Threshold"] = df["Threshold"].astype(int)
df = df.sort_values(by=['Team','Date'])
df.groupby('Team').apply(lambda x: (x['Amount'].cumsum() *(x['Budget'] > x['Running Total'].astype(int)).shift(-1)).fillna(x['Running Total']))
Upvotes: 1
Views: 791
Reputation: 8780
Try this and see if it's what you're looking for.
thresh = 5000000
df['cs'] = df.groupby((df['Amount'].cumsum()) // thresh)['Amount'].cumsum()
If you need to groupby Team
, then:
df['cs'] = df.groupby(['Team', df.groupby('Team')['Amount'].cumsum() // thresh])['Amount'].cumsum()
Upvotes: 2