Reputation: 2893
There are many queries and answers related to conditional cumulative sum (Cumsum Reset based on a condition in Pandas, Reset Cumulative sum base on condition Pandas, Cumsum Reset based on a condition in Pandas). But I am not able to solve the problem I face. Below is the part of data I have and the requiremet is to keep a count of change in 'type' and corresponding cumulative sum.
type sale
y 10
y 20
y 5
n 30
n 20
n 5
y 10
y 40
y 15
My requirement is to get a serial count of change in type and cumulative sale as below.
type sale tp_cum cum_sale
y 10 1 10
y 20 1 30
y 5 1 35
n 30 2 30
n 20 2 50
n 5 2 55
y 10 3 10
y 40 3 50
y 15 3 65
I tried various modifications of the code below but is not exactly meeting the requirements. Please help.
sales['cum_sale'] = stock.groupby('type')['sale'].cumsum()
Dataframe:
df = pd.DataFrame([["y",10 ],
["y",20 ],
["y",5 ],
["n",30 ],
["n",20 ],
["n",5 ],
["y",10 ],
["y",40 ],
["y",15 ]],columns = ["type","sale"])
Upvotes: 3
Views: 1425
Reputation: 1058
Here is an option, you first create the tp_cum
column and then cumsum()
import pandas as pd
import numpy as np
df = pd.DataFrame([["y",10 ],
["y",20 ],
["y",5 ],
["n",30 ],
["n",20 ],
["n",5 ],
["y",10 ],
["y",40 ],
["y",15 ]],columns = ["type","sale"])
df["type2"] = np.cumsum((df["type"] != df["type"].shift(1)))
df["cum_sale"] = df[["sale","type2"]].groupby("type2").cumsum()
df
Output:
type sale type2 cum_sale
0 y 10 1 10
1 y 20 1 30
2 y 5 1 35
3 n 30 2 30
4 n 20 2 50
5 n 5 2 55
6 y 10 3 10
7 y 40 3 50
8 y 15 3 65
Upvotes: 4