How to count appearances of a value until it changes to another one?

Question

I have a pandas DataFrame called df with one column called Value. I want to add a column that counts consecutive appearances of the same value and restarts whenever the value changes. Let's call this new column count.

My dataframe looks like this:

import pandas as pd
import numpy as np

ar = np.array([[1], [1], [2], [2], [3], [3], [1], [1], [2], [2]])
df = pd.DataFrame(ar, columns=['Value'])

print(df)

   Value
0      1
1      1
2      2
3      2
4      3
5      3
6      1
7      1
8      2
9      2

I tried this code:

df['count'] = df.groupby('Value').cumcount() + 1

This returns:

print(df)
   Value  count
0      1      1
1      1      2
2      2      1
3      2      2
4      3      1
5      3      2
6      1      3
7      1      4
8      2      3
9      2      4

I expect something like this:

print(df)
   Value  count
0      1      1
1      1      2
2      2      1
3      2      2
4      3      1
5      3      2
6      1      1
7      1      2
8      2      1
9      2      2

Is there a way to get that output?

anky · Accepted Answer

If I understand correctly, you want the count to restart at each run of consecutive equal values. Use:

df = df.assign(count=df.groupby(df.Value.ne(df.Value.shift()).cumsum()).cumcount().add(1))

   Value  count
0      1      1
1      1      2
2      2      1
3      2      2
4      3      1
5      3      2
6      1      1
7      1      2
8      2      1
9      2      2

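Here the inner comparison flags the first row of each run of equal values, and its cumulative sum gives each run its own group key: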

print(df.Value.ne(df.Value.shift()))

0     True
1    False
2     True
3    False
4     True
5    False
6     True
7    False
8     True
9    False
Name: Value, dtype: bool
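
To make the intermediate steps explicit, here is a minimal sketch that splits the one-liner into parts. The names starts and run_id are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'Value': [1, 1, 2, 2, 3, 3, 1, 1, 2, 2]})

# True on the first row of every run of consecutive equal values
# (the first row compares against NaN from shift(), so it is always True)
starts = df['Value'].ne(df['Value'].shift())

# Running total of run starts: each run gets its own integer label
run_id = starts.cumsum()

# Number the rows within each run, starting at 1
df['count'] = df.groupby(run_id).cumcount().add(1)

print(run_id.tolist())       # [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
print(df['count'].tolist())  # [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
```

Grouping by run_id instead of by Value is what makes the count restart: the second run of 1s gets label 4 rather than sharing a group with the first run. One caveat: NaN never compares equal to NaN, so consecutive missing values would each start a new run with this approach.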
