Ramon Ankersmit

Reputation: 109

Getting max values from pandas multiindex dataframe

I'm trying to retrieve only the max values (including the corresponding index values) from a pandas dataframe that has a multi-level index. The dataframe I have is generated via a groupby and column selection ('tOfmAJyI') like this:

df.groupby('id')['tOfmAJyI'].value_counts()

Out[4]: 
id     tOfmAJyI
3      mlNXN       4
       SSvEP       2
       hCIpw       2
5      SSvEP       2
       hCIpw       1
       mlNXN       1
11     mlNXN       2
       SSvEP       1
...

What I would like to achieve is to get the max values including their corresponding index values. So something like:

id     tOfmAJyI
3      mlNXN       4
5      SSvEP       2
11     mlNXN       2
...

Any ideas how I can achieve this? I was able to get the id and max value but I'm still trying to get the corresponding value of 'tOfmAJyI'.

Upvotes: 11

Views: 8281

Answers (4)

Bertil Johannes Ipsen

Reputation: 1776

If your values are not already sorted, I think the best general answer is this variation of the one by ffgg, which avoids the duplicated index level.

df.groupby(level=0, group_keys=False).nlargest(1)

Example:

>> df

id  tOfmAJyI
3   mlNXN       4
    SSvEP       2
    hCIpw       2
5   SSvEP       2
    hCIpw       1
    mlNXN       1
11  mlNXN       2
    SSvEP       1
Name: val, dtype: int64


>> df.groupby(level=0, group_keys=False).nlargest(1)

id  tOfmAJyI
3   mlNXN       4
5   SSvEP       2
11  mlNXN       2
Name: val, dtype: int64
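
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the Series from the example by hand. The variable names `counts` and `top` are placeholders of mine; the index names and the name `val` are taken from the output above.

```python
import pandas as pd

# Rebuild the example Series: a two-level MultiIndex (id, tOfmAJyI) with counts.
index = pd.MultiIndex.from_tuples(
    [
        (3, "mlNXN"), (3, "SSvEP"), (3, "hCIpw"),
        (5, "SSvEP"), (5, "hCIpw"), (5, "mlNXN"),
        (11, "mlNXN"), (11, "SSvEP"),
    ],
    names=["id", "tOfmAJyI"],
)
counts = pd.Series([4, 2, 2, 2, 1, 1, 2, 1], index=index, name="val")

# Largest value per id; group_keys=False keeps the original two index levels
# instead of prepending the group key as an extra level.
top = counts.groupby(level=0, group_keys=False).nlargest(1)
print(top)
```

Because nlargest does its own ordering inside each group, this does not depend on the rows being sorted beforehand.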

Upvotes: 0

ffgg

Reputation: 174

I had a similar question, and I don't think this question currently has a good answer.

My solution was this, which I think is cleaner:

df.groupby(level=0).nlargest(1)

This keeps the MultiIndex and doesn't need a lambda function.
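
A quick sketch of what this returns, assuming a Series shaped like the one in the question (rebuilt by hand here; `counts`, `with_key` and `cleaned` are placeholder names). With the default group_keys setting, the group key is prepended to the index, so `id` shows up as two levels; `droplevel(0)` is one way to remove the extra one.

```python
import pandas as pd

# Hand-built stand-in for the value_counts() result in the question.
index = pd.MultiIndex.from_tuples(
    [
        (3, "mlNXN"), (3, "SSvEP"), (3, "hCIpw"),
        (5, "SSvEP"), (5, "hCIpw"), (5, "mlNXN"),
    ],
    names=["id", "tOfmAJyI"],
)
counts = pd.Series([4, 2, 2, 2, 1, 1], index=index)

# Default group_keys: the result index has three levels (id, id, tOfmAJyI).
with_key = counts.groupby(level=0).nlargest(1)
print(with_key.index.nlevels)

# Drop the prepended group key (by position) to get back to two levels.
cleaned = with_key.droplevel(0)
print(cleaned)
```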

Upvotes: 0

Spirit_Gate

Reputation: 41

I can't understand why the practical solution for this problem isn't mentioned anywhere!?

Just do this:

For a DataFrame DF with keys KEY1 and KEY2, where you want the max value for every KEY1, including KEY2:

DF.groupby('KEY1').apply(lambda x: x.max())

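You'll get the maximum for each KEY1. Be aware that x.max() takes the maximum of each column separately, so the KEY2 it reports is the largest KEY2 in the group, which is not necessarily the KEY2 of the row that holds the maximum value.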
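
A small sketch to check what a column-wise maximum returns per group, assuming a hypothetical frame in which KEY1 and KEY2 are ordinary columns and the counts sit in a column called VAL (that column name and the variable names are assumptions, not from the answer). It uses `groupby(...).max()`, which reduces each column on its own just as `x.max()` does, and compares it with selecting whole rows through `idxmax`.

```python
import pandas as pd

# Hypothetical frame: KEY1/KEY2 as plain columns, VAL holds the counts.
frame = pd.DataFrame(
    {
        "KEY1": [3, 3, 3, 5, 5, 5],
        "KEY2": ["mlNXN", "SSvEP", "hCIpw", "SSvEP", "hCIpw", "mlNXN"],
        "VAL": [4, 2, 2, 2, 1, 1],
    }
)

# Column-wise maximum per group: KEY2 and VAL are reduced independently,
# so the KEY2 shown need not come from the row with the largest VAL.
colwise = frame.groupby("KEY1").max()
print(colwise)

# Row-wise alternative: find the row label of the largest VAL in each group,
# then pull those complete rows.
rows = frame.loc[frame.groupby("KEY1")["VAL"].idxmax()]
print(rows)
```

For KEY1 = 5 the two results disagree on KEY2, which is the case to look at when deciding between the two approaches.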

Upvotes: 4

BENY

Reputation: 323396

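Use groupby + head. This works here because value_counts already sorts each group in descending order, so the first row per id is the maximum: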

df.groupby(level=0).head(1)
Out[1882]: 
id  tOfmAJyI
3   mlNXN       4
5   SSvEP       2
11  mlNXN       2
Name: V, dtype: int64

Or

df.loc[df.groupby(level=0).idxmax()]
Out[1888]: 
id  tOfmAJyI
3   mlNXN       4
5   SSvEP       2
11  mlNXN       2
Name: V, dtype: int64
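
The two variants differ once the rows are not pre-sorted. Below is a minimal sketch on a deliberately unsorted, hand-built Series (`unsorted`, `first` and `best` are placeholder names; the name `V` follows the output above). It passes the idxmax result through `.tolist()` so that `.loc` receives a plain list of index tuples, which is a small deviation from the one-liner above.

```python
import pandas as pd

# Within id 3 the largest value is NOT in the first row.
index = pd.MultiIndex.from_tuples(
    [(3, "hCIpw"), (3, "mlNXN"), (5, "SSvEP"), (5, "hCIpw")],
    names=["id", "tOfmAJyI"],
)
unsorted = pd.Series([1, 4, 2, 2], index=index, name="V")

# head(1) just takes the first row of each group, whatever its value is.
first = unsorted.groupby(level=0).head(1)
print(first)

# idxmax returns the full index tuple of each group's maximum,
# so the selection does not depend on row order.
best = unsorted.loc[unsorted.groupby(level=0).idxmax().tolist()]
print(best)
```

On ties, idxmax keeps the first occurrence, so only one row per id comes back.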

Upvotes: 11
