Runing Count with rest condition on another row consecutively

Question

I have dataframe where I want to keep on increasing the value until val changes and when id changes reset the count value

data = [['p1','1'],
        ['p1','1'],
        ['p1','2'],
        ['p2','3'],
        ['p2','5'],
        ['p3','2'],
        ['p2','3'],
        ['p2','4']]

df = pd.DataFrame(data = data,columns = ['id','val'])

Desired output

   id val  count
0  p1   1      1
1  p1   1      1
2  p1   2      2
3  p2   3      1
4  p2   5      2
5  p3   2      1
6  p2   3      1
7  p2   4      2

When I am doing groupby than transform

df['count']=df.groupby('id').val.transform(lambda x : x.factorize()[0]+1)

if you see the output only problem is the last row you will see count 3 as it is group by it is summing with the previous p2 but i want to ignore that and let reset completely and count increment consecutively rather than whole groupby.

   id val  count
0  p1   1      1
1  p1   1      1
2  p1   2      2
3  p2   3      1
4  p2   5      2
5  p3   2      1
6  p2   3      1
7  p2   4      3

jezrael · Accepted Answer

You can pass custom Series created by compare by Series.ne for not equal with Series.shift and cumulative sum by Series.cumsum:

g = df['id'].ne(df['id'].shift()).cumsum()
df['count']=df.groupby(g).val.transform(lambda x : x.factorize()[0]+1)

print (df)
   id val  count
0  p1   1      1
1  p1   1      1
2  p1   2      2
3  p2   3      1
4  p2   5      2
5  p3   2      1
6  p2   3      1
7  p2   4      2

Runing Count with rest condition on another row consecutively

Answers (1)

Related Questions