Reputation: 1354
Given an example df like the one below, I want an incrementing counter over the unique values of val, with 0 for every repeat of a value already seen. The closest I've gotten is df.groupby('val').cumcount(), but that counts rows within each group, which isn't what I want.
import pandas as pd
df = pd.DataFrame({'val': [100, 101, 104, 104, 106, 108, 108, 108]})
Desired result:
val ctr
0 100 1
1 101 2
2 104 3
3 104 0
4 106 4
5 108 5
6 108 0
7 108 0
Upvotes: 0
Views: 86
Reputation: 35686
We could use groupby ngroup to enumerate the groups (with sort=False if the groups should be numbered in the order they appear in the DataFrame), then mask out the duplicated values:
s = df.groupby('val', sort=False).ngroup() + 1 # Get unique group number
df['ctr'] = s.mask(s.duplicated(), 0) # Add in the 0s
df:
val ctr
0 100 1
1 101 2
2 104 3
3 104 0
4 106 4
5 108 5
6 108 0
7 108 0
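The effect of sort=False only shows up when val is not already sorted. A minimal sketch on a made-up unsorted frame (the data and the names by_appearance, by_sorted and ctr are assumptions for illustration, not part of the question):

```python
import pandas as pd

# Hypothetical unsorted data, chosen only to show the difference
df = pd.DataFrame({'val': [108, 100, 108, 104, 100]})

# sort=False numbers groups in order of first appearance
by_appearance = df.groupby('val', sort=False).ngroup() + 1

# The default (sort=True) numbers groups by sorted key instead
by_sorted = df.groupby('val').ngroup() + 1

# Zero out every repeat of a group number that has already been seen
ctr = by_appearance.mask(by_appearance.duplicated(), 0)

print(by_appearance.tolist())
print(by_sorted.tolist())
print(ctr.tolist())
```

With the default sorting, the first row (108) would get the highest number rather than 1, so the counter would no longer increase down the frame.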
Or with pd.factorize and np.where to assign duplicated values to 0:
import numpy as np
m = df['val'].duplicated()  # True for every repeat after the first occurrence
df['ctr'] = np.where(m, 0, pd.factorize(df['val'])[0] + 1)  # 1-based codes, 0 for repeats
df:
val ctr
0 100 1
1 101 2
2 104 3
3 104 0
4 106 4
5 108 5
6 108 0
7 108 0
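If this is needed in more than one place, the same idea can be wrapped in a small helper. The sketch below is a variation rather than the code above: it counts first occurrences with a cumulative sum instead of calling pd.factorize, and the name unique_counter is hypothetical.

```python
import pandas as pd


def unique_counter(values: pd.Series) -> pd.Series:
    """Number each first occurrence 1, 2, 3, ... and give repeats 0."""
    first = ~values.duplicated()          # True only at the first occurrence
    return first.cumsum().where(first, 0)  # running count, zeroed on repeats


df = pd.DataFrame({'val': [100, 101, 104, 104, 106, 108, 108, 108]})
df['ctr'] = unique_counter(df['val'])
print(df)
```

Because it relies on duplicated() rather than on comparing neighbouring rows, it should not require val to be sorted.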
Upvotes: 1
Reputation: 23227
If the values in 'val' are in sorted order (so equal values are always adjacent), you can use:
m = df['val'].ne(df['val'].shift())  # True where the value differs from the previous row
df['ctr'] = np.where(m, m.cumsum(), 0)  # running count at each change, 0 elsewhere
Result:
print(df)
val ctr
0 100 1
1 101 2
2 104 3
3 104 0
4 106 4
5 108 5
6 108 0
7 108 0
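The sorted-order condition matters because shift() only compares each row with the one directly before it. A small sketch on made-up unsorted data (the frame and the names shift_ctr and dup_ctr are assumptions for illustration) shows where the two approaches diverge:

```python
import numpy as np
import pandas as pd

# Hypothetical data where repeats are NOT adjacent
df = pd.DataFrame({'val': [100, 104, 100, 104]})

# Neighbour comparison: every row differs from the previous one,
# so each row is treated as a new value
m = df['val'].ne(df['val'].shift())
shift_ctr = np.where(m, m.cumsum(), 0)

# duplicated() looks at the whole column, so later repeats get 0
dup = df['val'].duplicated()
dup_ctr = np.where(dup, 0, pd.factorize(df['val'])[0] + 1)

print(shift_ctr.tolist())
print(dup_ctr.tolist())
```

On sorted input the two give the same result, so the shift version is a reasonable choice when that ordering is guaranteed.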
Upvotes: 1