Reputation: 1359
Suppose the following DataFrame is given:
df
step
0 1.0
1 1.0
2 1.0
3 2.0
4 2.0
5 3.0
6 4.0
7 1.0
8 1.0
9 2.0
10 3.0
I now want to "cluster" the data based on the occurrence of step == 1.0
and increment a counter each time a new run of 1.0 values begins.
Desired outcome is:
df_count
step count
0 1.0 1
1 1.0 1
2 1.0 1
3 2.0 1
4 2.0 1
5 3.0 1
6 4.0 1
7 1.0 2
8 1.0 2
9 2.0 2
10 3.0 2
Can you come up with a pandas pipeline to achieve this? Thanks in advance.
Upvotes: 1
Views: 688
Reputation: 863651
You can test for values equal to 1.0, keep only the first value of each consecutive run, and then take the cumulative sum to get the counter:
df['new'] = (df['step'].eq(1.0) & df['step'].ne(df['step'].shift())).cumsum()
print(df)
step new
0 1.0 1
1 1.0 1
2 1.0 1
3 2.0 1
4 2.0 1
5 3.0 1
6 4.0 1
7 1.0 2
8 1.0 2
9 2.0 2
10 3.0 2
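For anyone who wants to see why this works, here is a minimal self-contained sketch that splits the one-liner into its intermediate masks. The variable names (`is_one`, `is_run_start`, `starts_block`) are only illustrative and the column is called `count` to match the question; the input data is the frame from the question.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame(
    {"step": [1.0, 1.0, 1.0, 2.0, 2.0, 3.0, 4.0, 1.0, 1.0, 2.0, 3.0]}
)

# True wherever step equals 1.0.
is_one = df["step"].eq(1.0)

# True wherever the value differs from the previous row.
# The first row is compared against NaN, so it is always True.
is_run_start = df["step"].ne(df["step"].shift())

# True only on the first row of each consecutive run of 1.0 values.
starts_block = is_one & is_run_start

# Cumulative sum of booleans: the counter goes up by one at each block start.
df["count"] = starts_block.cumsum()

print(df)
```

Note that if the column does not begin with 1.0, the rows before the first 1.0 would get a count of 0 with this approach.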
Upvotes: 4