Reputation: 850
I have dataframe where I want to keep on increasing the value until val
changes and when id
changes reset the count
value
data = [['p1','1'],
['p1','1'],
['p1','2'],
['p2','3'],
['p2','5'],
['p3','2'],
['p2','3'],
['p2','4']]
df = pd.DataFrame(data = data,columns = ['id','val'])
Desired output
id val count
0 p1 1 1
1 p1 1 1
2 p1 2 2
3 p2 3 1
4 p2 5 2
5 p3 2 1
6 p2 3 1
7 p2 4 2
When I am doing groupby than transform
df['count']=df.groupby('id').val.transform(lambda x : x.factorize()[0]+1)
if you see the output only problem is the last row you will see count 3 as it is group by it is summing with the previous p2 but i want to ignore that and let reset completely and count increment consecutively rather than whole groupby.
id val count
0 p1 1 1
1 p1 1 1
2 p1 2 2
3 p2 3 1
4 p2 5 2
5 p3 2 1
6 p2 3 1
7 p2 4 3
Upvotes: 0
Views: 32
Reputation: 863166
You can pass custom Series created by compare by Series.ne
for not equal with Series.shift
and cumulative sum by Series.cumsum
:
g = df['id'].ne(df['id'].shift()).cumsum()
df['count']=df.groupby(g).val.transform(lambda x : x.factorize()[0]+1)
print (df)
id val count
0 p1 1 1
1 p1 1 1
2 p1 2 2
3 p2 3 1
4 p2 5 2
5 p3 2 1
6 p2 3 1
7 p2 4 2
Upvotes: 1