Reputation: 87
I have a pandas DataFrame called df with a single column called Value. I want to add a column that counts consecutive appearances of the same value and restarts whenever the value changes. Let's call this new column count.
My dataframe looks like this:
import pandas as pd
import numpy as np
ar = np.array([[1], [1], [2],[2], [3], [3], [1], [1], [2], [2]])
df = pd.DataFrame(ar, columns = ['Value'])
print(df)
Value
0 1
1 1
2 2
3 2
4 3
5 3
6 1
7 1
8 2
9 2
I tried this code:
df['count'] = df.groupby('Value').cumcount() + 1
Which returns:
print(df)
Value count
0 1 1
1 1 2
2 2 1
3 2 2
4 3 1
5 3 2
6 1 3
7 1 4
8 2 3
9 2 4
I expect something like this:
print(df)
Value count
0 1 1
1 1 2
2 2 1
3 2 2
4 3 1
5 3 2
6 1 1
7 1 2
8 2 1
9 2 2
Is there a way to get that output?
Upvotes: 1
Views: 688
Reputation: 315
Though @anky_91's answer is perfect, a naive solution is to write a function count_upto that loops over the values instead of using the methods discussed in his answer.
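def count_upto(series):
    # Every run starts at 1; extend the count while the value repeats.
    count = np.ones(len(series), np.int32)
    for i in range(1, len(series)):
        if series[i] == series[i - 1]:
            count[i] = count[i - 1] + 1
    return count

# Pass the underlying NumPy array so indexing is positional.
df['count'] = count_upto(df.Value.values)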
print(df)
>>>
Value count
0 1 1
1 1 2
2 2 1
3 2 2
3 2 1
4 3 1
5 3 2
6 1 1
7 1 2
8 2 1
9 2 2
Upvotes: 0
Reputation: 75080
IIUC, use:
df=df.assign(count=df.groupby(df.Value.ne(df.Value.shift()).cumsum()).cumcount().add(1))
Value count
0 1 1
1 1 2
2 2 1
3 2 2
4 3 1
5 3 2
6 1 1
7 1 2
8 2 1
9 2 2
Here, df.Value.ne(df.Value.shift()) marks each row whose value differs from the previous row:
print(df.Value.ne(df.Value.shift()))
0 True
1 False
2 True
3 False
4 True
5 False
6 True
7 False
8 True
9 False
Name: Value, dtype: bool
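To make the one-liner easier to follow, here is a sketch that splits it into steps on the same data as the question. The intermediate names is_new_run and run_id are only illustrative and do not appear in the original code.

```python
import pandas as pd

df = pd.DataFrame({"Value": [1, 1, 2, 2, 3, 3, 1, 1, 2, 2]})

# True wherever the value differs from the previous row. The first row is
# always True because shift() leaves NaN in that position.
is_new_run = df["Value"].ne(df["Value"].shift())

# The cumulative sum of the booleans gives every consecutive run its own id.
run_id = is_new_run.cumsum()

# Number the rows inside each run, starting at 1.
df["count"] = df.groupby(run_id).cumcount() + 1

print(run_id.tolist())       # [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
print(df["count"].tolist())  # [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
```

Grouping by run_id rather than by Value is what makes the counter restart: the second run of 1s gets id 4, so it is not merged with the first run of 1s.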
Upvotes: 6