Reputation: 4398
I have a dataframe where I would like to perform cumulative calculations based on certain columns. Needs to be grouped by id and date.
Data
id date pwr start move
aa q1 22 10 4 2
aa q2 22 5 3 1
aa q3 22 7 3 0
aa q4 22 3 3 0
bb q1 22 20 1 0
bb q2 22 10 2 0
bb q3 22 5 2 1
bb q4 22 5 1 1
Logic:
final column values created by:
Desired
id date pwr start move final
aa q1 22 10 4 2 2
aa q2 22 5 3 1 -4
aa q3 22 7 3 0 -11
aa q4 22 3 3 0 -14
bb q1 22 20 1 0 1
bb q2 22 10 2 0 -9
bb q3 22 5 2 1 -15
bb q4 22 5 1 1 -21
Doing
df['final'] = df['start'].sub(df.groupby('id')['date'].cumsum())
df = df.assign(start = df['start'] - df['move'])
Performing cumsum grouping by id and date- however, the values in the final column are not calculating correctly. I am still troubleshooting this, any suggestion is appreciated.
Upvotes: 1
Views: 76
Reputation: 120429
Try:
def f(x):
final = x['start'].sub(x['move']).iloc[0]
return x.shift(-1)[['pwr', 'move']].sum(axis=1).mul(-1) \
.shift(fill_value=final).cumsum()
df['final'] = df.groupby('id').apply(f).astype(int).values
>>> df
id date pwr start move final
0 aa q1 22 10 4 2 2
1 aa q2 22 5 3 1 -4
2 aa q3 22 7 3 0 -11
3 aa q4 22 3 3 0 -14
4 bb q1 22 20 1 0 1
5 bb q2 22 10 2 0 -9
6 bb q3 22 5 2 1 -15
7 bb q4 22 5 1 1 -21
Upvotes: 1