groupby and withhold information of one column based on value of another column

Question

I have the following DataFrame consisting out of columns id, brand and count

Id  brand  count
1   Audi   3 
2   BWM    5 
2   FORD   3
3   AUDI   7
4   BMW    2
5   Audi   4
5   FORD   3

I would like to groupby id and only remain each id with the brand that has the highest count. So in the end I would like to have the following:

id  brand
1   AUDI
2   BMW
3   AUDI
4   BMW
5   AUDI

I have something like this but that obviously is not working. So what would be the correct function or syntax to accomplish that? Thanks!

data.groupby('id')['brand'].where(max('count'))

Chris Adams · Accepted Answer

IIUC use groupby.idxmax and loc:

df.loc[df.groupby('Id')['count'].idxmax()]

[out]

   Id brand  count
0   1  Audi      3
1   2   BWM      5
3   3  AUDI      7
4   4   BMW      2
5   5  Audi      4

Answers (2)