Reputation: 149
I have the following df:
Doc Item
1 1
1 1
1 2
1 3
2 1
2 2
I want to add a third column with repeating values that (1) increment by one whenever the value in column "Item" changes and (2) restart whenever the value in column "Doc" changes:
Doc Item NewCol
1 1 1
1 1 1
1 2 2
1 3 3
2 1 1
2 2 2
What is the best way to achieve this? Thanks a lot.
Upvotes: 1
Views: 65
Reputation: 862651
Use GroupBy.transform with a custom lambda function that calls factorize:
df['NewCol'] = df.groupby('Doc')['Item'].transform(lambda x: pd.factorize(x)[0]) + 1
print(df)
   Doc  Item  NewCol
0    1     1       1
1    1     1       1
2    1     2       2
3    1     3       3
4    2     1       1
5    2     2       2
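For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample data in the question, and the helper name add_counter is made up for illustration only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Doc": [1, 1, 1, 1, 2, 2],
                   "Item": [1, 1, 2, 3, 1, 2]})


def add_counter(frame):
    # Hypothetical helper (not part of pandas): number the distinct Item
    # values within each Doc in order of first appearance.
    # factorize codes start at 0, hence the + 1.
    out = frame.copy()
    codes = out.groupby("Doc")["Item"].transform(lambda x: pd.factorize(x)[0])
    out["NewCol"] = codes + 1
    return out


result = add_counter(df)
print(result)
```

Because factorize only looks at which values are equal, this should also work when Item holds strings rather than numbers.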
If the values in Item are integers, it is also possible to use GroupBy.rank:
df['NewCol'] = df.groupby('Doc')['Item'].rank(method='dense').astype(int)
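The two approaches give the same result on the question's data, but they can differ on other inputs: factorize numbers values by order of first appearance, while rank numbers them by size. Neither one increments again when an earlier value comes back. The sketch below uses made-up data to show the difference. The third variant (shift plus cumsum) is not from the answer above; it is just one possible way to count every change from the previous row.

```python
import pandas as pd

# Made-up data: Item is not sorted within Doc 1, and the value 5 returns later.
df = pd.DataFrame({"Doc": [1, 1, 1, 1, 2, 2],
                   "Item": [5, 5, 3, 5, 1, 2]})

g = df.groupby("Doc")["Item"]

# Numbered by order of first appearance within each Doc: 5 -> 1, 3 -> 2.
by_appearance = (g.transform(lambda x: pd.factorize(x)[0]) + 1).tolist()

# Numbered by size within each Doc: 3 -> 1, 5 -> 2.
by_value = g.rank(method="dense").astype(int).tolist()

# Counter that goes up at every change from the previous row in the same Doc.
# The first row of each Doc is compared with NaN, so it always counts as a change.
changed = df["Item"].ne(g.shift()).astype(int)
by_change = changed.groupby(df["Doc"]).cumsum().tolist()

print(by_appearance)
print(by_value)
print(by_change)
```

Which of the three is appropriate depends on whether "change" means a new distinct value, a larger value, or any difference from the row before.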
Upvotes: 2