Reputation: 912
Sample Data
date_rng = pd.date_range('2019-01-01', freq='s', periods=400)
df = pd.DataFrame(np.random.lognormal(.005, .5, size=(len(date_rng))),
columns=['data1'],
index=date_rng)
Example input
data1
2019-01-01 00:00:00 1
2019-01-01 00:00:01 -2
2019-01-01 00:00:02 2
2019-01-01 00:00:03 3
2019-01-01 00:00:04 1
2019-01-01 00:00:05 2
2019-01-01 00:00:06 -1
2019-01-01 00:00:07 3
2019-01-01 00:00:08 4
2019-01-01 00:00:09 5
2019-01-01 00:00:10 7
2019-01-01 00:00:11 2
2019-01-01 00:00:12 4
2019-01-01 00:00:13 -1
2019-01-01 00:00:14 5
2019-01-01 00:00:15 3
2019-01-01 00:00:16 5
2019-01-01 00:00:17 -3
... ...
Expected output
data1 cumsum
2019-01-01 00:00:00 1 1
2019-01-01 00:00:01 -2 -1
2019-01-01 00:00:02 2 1
2019-01-01 00:00:03 3 4
2019-01-01 00:00:04 1 5 (reset cumsum at this point)
2019-01-01 00:00:05 2 2
2019-01-01 00:00:06 -1 1
2019-01-01 00:00:07 3 4
2019-01-01 00:00:08 4 8 (reset at this point)
2019-01-01 00:00:09 5 5 (reset at this point)
2019-01-01 00:00:10 7 7 (reset at this point)
2019-01-01 00:00:11 2 2
2019-01-01 00:00:12 4 6 (reset at this point)
2019-01-01 00:00:13 -1 -1
2019-01-01 00:00:14 5 4
2019-01-01 00:00:15 3 7 (reset at this point)
2019-01-01 00:00:16 1 1
2019-01-01 00:00:17 -3 -2
... ...
I want to calculate the cumulative sum of the data1
based on the condtiontion that it will reset if it larger than 5 and exceed 20% of the last reset. For example: for the first calculation, it will only reset if it's larger than 5, after that, it will reset based on both conditions.
I have checked out some other answers on stackoverflow but I didn't find similar problems. Please advice me on how to solve this problem.
Upvotes: 2
Views: 140
Reputation: 2495
You can use a simple loop
with if else
and some logic to solve this problem.
ls = [] # store result
cumsum = 0
last_reset = 0
for _, row in df.iterrows():
cumsum = cumsum + row.data1
ls.append(cumsum)
if cumsum >= 5 and cumsum > 0.2*last_reset:
last_reset = cumsum
cumsum = 0
df['cumsum'] = ls
df
data1 cumsum
2019-01-01 00:00:00 1 1
2019-01-01 00:00:01 -2 -1
2019-01-01 00:00:02 2 1
2019-01-01 00:00:03 3 4
2019-01-01 00:00:04 1 5
2019-01-01 00:00:05 2 2
2019-01-01 00:00:06 -1 1
2019-01-01 00:00:07 3 4
2019-01-01 00:00:08 4 8
2019-01-01 00:00:09 5 5
2019-01-01 00:00:10 7 7
2019-01-01 00:00:11 2 2
2019-01-01 00:00:12 4 6
2019-01-01 00:00:13 -1 -1
2019-01-01 00:00:14 5 4
2019-01-01 00:00:15 3 7
2019-01-01 00:00:16 1 1
2019-01-01 00:00:17 -3 -2
Upvotes: 1