Reputation: 197
I would like to create a running count of the values in the "Button" column (Start, Stop) within each "ID". However, any change in the "Button" value or a change of "ID" should reset the running count. The data frame is below:
data = pd.DataFrame({
'ID': ['A','A','B','B','C','C','C','C','C','D','E','E'],
'Button': ['Start','Stop','Start','Stop','Start','Start','Stop','Start','Stop','Start','Start','Stop']
})
I can create a running count based on the "Button" values, but can't figure out how to group that by "ID".
data['runningCount'] = data.groupby(data['Button']).cumcount()+1
I'm looking for the following result:
result = pd.DataFrame({
'ID': ['A','A','B','B','C','C','C','C','C','D','E','E'],
'Button': ['Start','Stop','Start','Stop','Start','Start','Stop','Start','Stop','Start','Start','Stop'],
'Count': [1,1,1,1,1,2,1,1,1,1,1,1]})
Upvotes: 3
Views: 813
Reputation: 323316
You just need to create a specific key with shift
s=data.groupby('ID').Button.apply(lambda x : (x!=x.shift()).cumsum())
data.groupby([data.ID,s]).cumcount()+1
Out[189]:
0 1
1 1
2 1
3 1
4 1
5 2
6 1
7 1
8 1
9 1
10 1
11 1
dtype: int64
More info
data.groupby('ID').Button.apply(lambda x : (x!=x.shift()).cumsum())
Out[192]:
0 1
1 2
2 1
3 2
4 1
5 1
6 2
7 3
8 4
9 1
10 1
11 2
Name: Button, dtype: int32
Upvotes: 3