Reputation: 117
I have a data frame like this.
df = pd.DataFrame({'transaction_id':[12565,12565,12743,12743,13456,13456,13856],
'rep_id':[560,560,560,560,287,287,287]})
I want to create a new column with a counter like below.
transaction_id rep_id trans_num
0 12565 560 1
1 12565 560 1
2 12743 560 2
3 12743 560 2
4 13456 287 1
5 13456 287 1
6 13856 287 2
Upvotes: 2
Views: 48
Reputation: 150805
Based on your data (transaction_id
are different if rep_id
are different) we can also do:
df['new'] = (df['transaction_id'].ne(df['transaction_id'].shift())
.groupby(df['rep_id']).cumsum()
)
Update: You can also use rank
, although it behaves a bit differently:
df.groupby('rep_id')['transaction_id'].rank('dense').astype(int)
Output:
transaction_id rep_id new
0 12565 560 1
1 12565 560 1
2 12743 560 2
3 12743 560 2
4 13456 287 1
5 13456 287 1
6 13856 287 2
Upvotes: 2
Reputation: 323376
Try with transform
+ factorize
df['new']=df.groupby('rep_id').transaction_id.transform(lambda x : pd.factorize(x)[0]+1)
df
Out[389]:
transaction_id rep_id new
0 12565 560 1
1 12565 560 1
2 12743 560 2
3 12743 560 2
4 13456 287 1
5 13456 287 1
6 13856 287 2
Upvotes: 8