Reputation: 21961
I have the foll. dataframe:
tot_pix Season caap_col lamma kite
datetime
2000-01-01 1914.0 2000.0 1.0 1.95025 117.737362
2000-01-04 1914.0 2000.0 1.0 1.95025 117.674177
2000-01-05 1914.0 2000.0 1.0 1.95025 117.995489
2001-01-04 1914.0 2001.0 1.0 1.95025 118.114809
2001-01-05 1914.0 2001.0 1.0 1.95025 118.160295
In the dataframe above, I want to compute cumulative sum of the kite
column. However, I want the cumulative sum to NOT span across Season
values. E.g. the output with cumulative sum for kite
should look like so:
tot_pix Season caap_col lamma kite
datetime
1/1/2000 1914 2000 1 1.95025 117.737362
1/4/2000 1914 2000 1 1.95025 235.411539
1/5/2000 1914 2000 1 1.95025 235.669666
1/4/2001 1914 2001 1 1.95025 118.114809
1/5/2001 1914 2001 1 1.95025 236.275104
I can compute cumulative sum using the cumsum
command, how do I restrict it to specific spans of Season
?
Upvotes: 2
Views: 1286
Reputation: 862521
#if not sorted index with column Season
#df = df.sort_index(sort_remaining=True).sort_values('Season')
df['kite'] = df.groupby('Season')['kite'].cumsum()
print (df)
tot_pix Season caap_col lamma kite
datetime
2000-01-01 1914.0 2000.0 1.0 1.95025 117.737362
2000-01-04 1914.0 2000.0 1.0 1.95025 235.411539
2000-01-05 1914.0 2000.0 1.0 1.95025 353.407028
2001-01-04 1914.0 2001.0 1.0 1.95025 118.114809
2001-01-05 1914.0 2001.0 1.0 1.95025 236.275104
Upvotes: 3