Reputation: 207
I have a table like this
id id2 val
a red apple
a red orange
b blue fish
c violet beef
a yellow banana
a black pork
I want to create a counter based on the values of id and id2.
For example, id is the customer Id and id2 is the order Id and val is the item name in the order. I want to create a counter for the customer id and order id. So that's why the first 2 rows should have the same counter 1
I tried the cumcount but they will give me something likes this
id id2 val counter
a red apple 1
a red orange 2
b blue fish 1
c violet beef 1
a yellow banana 1
a black pork 1
I tried the cumcount
df['counter'] = df.groupby(['id','id2']).cumcount() + 1
but it is not what I want
I want a result table like this
id id2 val counter
a red apple 1
a red orange 1
b blue fish 1
c violet beef 1
a yellow banana 2
a black pork 3
Thank you,
Upvotes: 3
Views: 485
Reputation: 18647
You can use groupby.transform
and pandas.factorize
:
df['counter'] = df.groupby(['id'])['id2'].transform(lambda x: pd.factorize(x)[0]).add(1)
id id2 val counter
0 a red apple 1
1 a red orange 1
2 b blue fish 1
3 c violet beef 1
4 a yellow banana 2
5 a black pork 3
Upvotes: 3
Reputation: 150805
Looks like you are trying to count the unique values of pairs (id, id2)
:
uniques = df.drop_duplicates(subset=['id','id2']).groupby('id').cumcount()+1
df['counter'] = uniques.reindex(df.index).ffill().astype(int)
Upvotes: 5