Sara
Sara

Reputation: 973

Extract rows with maximum values in pandas dataframe

We can use .idxmax to get the maximum value of a dataframe­(df). My problem is that I have a df with several columns (more than 10), one of a column has identifiers of same value. I need to extract the identifiers with the maximum value:

>df

id  value
a   0
b   1
b   1
c   0
c   2
c   1

Now, this is what I'd want:

>df

id  value
a   0
b   1
c   2

I am trying to get it by using df.groupy(['id']), but it is a bit tricky:

df.groupby(["id"]).ix[df['value'].idxmax()]

Of course, that doesn't work. I fear that I am not on the right path, so I thought I'd ask you guys! Thanks!

Upvotes: 1

Views: 2688

Answers (1)

Jeff
Jeff

Reputation: 129048

Close! Groupby the id, then use the value column; return the max for each group.

In [14]: df.groupby('id')['value'].max()
Out[14]: 
id
a     0
b     1
c     2
Name: value, dtype: int64

Op wants to provide these locations back to the frame, just create a transform and assign.

In [17]: df['max'] = df.groupby('id')['value'].transform(lambda x: x.max())

In [18]: df
Out[18]: 
  id  value  max
0  a      0    0
1  b      1    1
2  b      1    1
3  c      0    2
4  c      2    2
5  c      1    2

Upvotes: 5

Related Questions