Hai Vu
Hai Vu

Reputation: 207

Add conditional counter: counter column based on value of other columns

I have a table like this

id    id2      val
a     red      apple
a     red      orange
b     blue     fish
c     violet   beef
a     yellow   banana
a     black    pork

I want to create a counter based on the values of id and id2.

For example, id is the customer Id and id2 is the order Id and val is the item name in the order. I want to create a counter for the customer id and order id. So that's why the first 2 rows should have the same counter 1

I tried the cumcount but they will give me something likes this

id    id2      val      counter
a     red      apple    1
a     red      orange   2
b     blue     fish     1
c     violet   beef     1
a     yellow   banana   1
a     black    pork     1

I tried the cumcount

df['counter'] = df.groupby(['id','id2']).cumcount() + 1

but it is not what I want

I want a result table like this

id    id2      val      counter
a     red      apple    1
a     red      orange   1
b     blue     fish     1
c     violet   beef     1
a     yellow   banana   2
a     black    pork     3

Thank you,

Upvotes: 3

Views: 485

Answers (2)

Chris Adams
Chris Adams

Reputation: 18647

You can use groupby.transform and pandas.factorize:

df['counter'] = df.groupby(['id'])['id2'].transform(lambda x: pd.factorize(x)[0]).add(1)

  id     id2     val  counter
0  a     red   apple        1
1  a     red  orange        1
2  b    blue    fish        1
3  c  violet    beef        1
4  a  yellow  banana        2
5  a   black    pork        3

Upvotes: 3

Quang Hoang
Quang Hoang

Reputation: 150805

Looks like you are trying to count the unique values of pairs (id, id2):

uniques = df.drop_duplicates(subset=['id','id2']).groupby('id').cumcount()+1
df['counter'] = uniques.reindex(df.index).ffill().astype(int)

Upvotes: 5

Related Questions