Python_learner
Python_learner

Reputation: 43

groupby and withhold information of one column based on value of another column

I have the following DataFrame consisting out of columns id, brand and count

Id  brand  count
1   Audi   3 
2   BWM    5 
2   FORD   3
3   AUDI   7
4   BMW    2
5   Audi   4
5   FORD   3 

I would like to groupby id and only remain each id with the brand that has the highest count. So in the end I would like to have the following:

id  brand
1   AUDI
2   BMW
3   AUDI
4   BMW
5   AUDI

I have something like this but that obviously is not working. So what would be the correct function or syntax to accomplish that? Thanks!

data.groupby('id')['brand'].where(max('count'))

Upvotes: 4

Views: 52

Answers (2)

BENY
BENY

Reputation: 323326

IIUC

df=df.sort_values(['Id','count']).drop_duplicates('Id',keep='last')
Out[249]: 
   Id brand  count
0   1  Audi      3
1   2   BWM      5
3   3  AUDI      7
4   4   BMW      2
5   5  Audi      4

Upvotes: 1

Chris Adams
Chris Adams

Reputation: 18647

IIUC use groupby.idxmax and loc:

df.loc[df.groupby('Id')['count'].idxmax()]

[out]

   Id brand  count
0   1  Audi      3
1   2   BWM      5
3   3  AUDI      7
4   4   BMW      2
5   5  Audi      4

Upvotes: 3

Related Questions