Pandas groupby - count unique into separate columns per group

Question

I'm trying to do a groupby where in one column I have string categorical data:

ID   cat_1   cat_2
11   'OG'    'ASD'
11   'LOL'   'ASD' 
11   'OG'    'DFG' 
22   'LOL'   'DFG'
22   'OG'    'DFG'

And I'm trying to group by the ID, and aggregate the string data into a numeric feature, namely, the counts of occurrences for each category. So the outcome would be:

ID  OG  LOL  ASD  DFG
11   2    1    2    1
22   1    1    0    2

How can I achieve this in pandas? Thank you!

not_speshal · Accepted Answer

You could use pd.get_dummies with groupby and stack:

>>> pd.get_dummies(df.set_index("ID").stack()).groupby("ID").sum()
    ASD  DFG  LOL  OG
ID                   
11    2    1    1   2
22    0    2    1   1

Pandas groupby - count unique into separate columns per group

Answers (2)

Related Questions