Reputation: 109
I'm trying to retrieve only the max values (including the MultiIndex values) from a pandas object that has a MultiIndex. The object I have is generated via a groupby and column selection ('tOfmAJyI') like this:
df.groupby('id')['tOfmAJyI'].value_counts()
Out[4]:
id tOfmAJyI
3 mlNXN 4
SSvEP 2
hCIpw 2
5 SSvEP 2
hCIpw 1
mlNXN 1
11 mlNXN 2
SSvEP 1
...
What I would like to achieve is to get the max values including their corresponding index values. So something like:
id tOfmAJyI
3 mlNXN 4
5 SSvEP 2
11 mlNXN 2
...
Any ideas how I can achieve this? I was able to get the id and max value but I'm still trying to get the corresponding value of 'tOfmAJyI'.
Upvotes: 11
Views: 8281
Reputation: 1776
If your values aren't already sorted, I think the best general answer is this variation of the one by ffggk, which avoids duplicating the id level in the index.
df.groupby(level=0, group_keys=False).nlargest(1)
Example:
>> df
id tOfmAJyI
3 mlNXN 4
SSvEP 2
hCIpw 2
5 SSvEP 2
hCIpw 1
mlNXN 1
11 mlNXN 2
SSvEP 1
Name: val, dtype: int64
>> df.groupby(level=0, group_keys=False).nlargest(1)
id tOfmAJyI
3 mlNXN 4
5 SSvEP 2
11 mlNXN 2
Name: val, dtype: int64
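To illustrate the "not previously sorted" case, here is a minimal sketch. The sample Series is a hypothetical reconstruction of the counts in the question, deliberately shuffled within each id; the names counts and top are my own.

```python
import pandas as pd

# Hypothetical sample mirroring the question's counts, deliberately NOT sorted
# within each id, so the largest count is never the first row of its group.
index = pd.MultiIndex.from_tuples(
    [
        (3, "SSvEP"), (3, "mlNXN"), (3, "hCIpw"),
        (5, "hCIpw"), (5, "SSvEP"), (5, "mlNXN"),
        (11, "SSvEP"), (11, "mlNXN"),
    ],
    names=["id", "tOfmAJyI"],
)
counts = pd.Series([2, 4, 2, 1, 2, 1, 1, 2], index=index, name="val")

# Largest count per id; group_keys=False keeps the original two index levels.
top = counts.groupby(level=0, group_keys=False).nlargest(1)
print(top)
```

Because nlargest looks at the values rather than the row order, this should pick (3, mlNXN), (5, SSvEP) and (11, mlNXN) even though none of them comes first in its group.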
Upvotes: 0
Reputation: 174
I had a similar question, and I don't think this question currently has a good answer.
My solution was this, which I think is cleaner:
df.groupby(level=0).nlargest(1)
This keeps the MultiIndex object and doesn't need a lambda function.
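One thing worth checking with this form is the shape of the resulting index. As far as I know, with the default group_keys=True the group key is prepended as an extra level, so the id ends up in the index twice. The sketch below compares the two variants on a hypothetical reconstruction of the question's data; counts, with_keys and without_keys are my own names.

```python
import pandas as pd

# Hypothetical reconstruction of the Series shown in the question.
index = pd.MultiIndex.from_tuples(
    [
        (3, "mlNXN"), (3, "SSvEP"), (3, "hCIpw"),
        (5, "SSvEP"), (5, "hCIpw"), (5, "mlNXN"),
        (11, "mlNXN"), (11, "SSvEP"),
    ],
    names=["id", "tOfmAJyI"],
)
counts = pd.Series([4, 2, 2, 2, 1, 1, 2, 1], index=index, name="val")

# Default group_keys=True: the group key is added in front of the original index.
with_keys = counts.groupby(level=0).nlargest(1)

# group_keys=False: only the original index levels remain.
without_keys = counts.groupby(level=0, group_keys=False).nlargest(1)

print(with_keys.index.names)
print(without_keys.index.names)
```

The values are the same either way. If the extra level shows up, with_keys.droplevel(0) is another way to get rid of it.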
Upvotes: 0
Reputation: 41
I can't understand why the practical solution for this problem isn't mentioned anywhere!?
Just do this:
For a DataFrame DF with keys KEY1 and KEY2, where you want the max value for every KEY1, including KEY2:
DF.groupby('KEY1').apply(lambda x: x.max())
You'll get the maximum for each KEY1. Note that x.max() is computed per column, so if KEY2 is a column, the value reported is the largest KEY2 in the group, which is not necessarily the KEY2 on the row holding the maximum value.
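To make the difference between a per-column maximum and the row that holds the maximum concrete, here is a small sketch. It assumes KEY1, KEY2 and a value column are ordinary columns; the column name VAL and the sample data are hypothetical, and the idxmax lookup is a swapped-in technique, not the approach above.

```python
import pandas as pd

# Hypothetical flat frame; VAL is an assumed name for the value column.
DF = pd.DataFrame(
    {
        "KEY1": [3, 3, 3, 5, 5, 5, 11, 11],
        "KEY2": ["mlNXN", "SSvEP", "hCIpw", "SSvEP", "hCIpw", "mlNXN", "mlNXN", "SSvEP"],
        "VAL": [4, 2, 2, 2, 1, 1, 2, 1],
    }
)

# Per-column maximum in each group: KEY2 and VAL are maximised independently.
columnwise = DF.groupby("KEY1")[["KEY2", "VAL"]].max()

# Row holding the maximum: find the row label of the largest VAL per group,
# then select those complete rows.
rowwise = DF.loc[DF.groupby("KEY1")["VAL"].idxmax()]

print(columnwise)
print(rowwise)
```

For KEY1 == 5 the two disagree: the per-column maximum pairs the largest KEY2 string with the largest VAL, while the idxmax lookup returns the KEY2 that actually sits on the max row.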
Upvotes: 4
Reputation: 323396
groupby + head
df.groupby(level=0).head(1)
Out[1882]:
id tOfmAJyI
3 mlNXN 4
5 SSvEP 2
11 mlNXN 2
Name: V, dtype: int64
Or
df.loc[df.groupby(level=0).idxmax()]
Out[1888]:
id tOfmAJyI
3 mlNXN 4
5 SSvEP 2
11 mlNXN 2
Name: V, dtype: int64
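The two options differ once the data is no longer sorted within each group: head(1) just takes the first row per id (which is the max only because value_counts sorts descending), whereas idxmax looks at the values. A minimal sketch on hypothetical shuffled data; unsorted, by_head and by_idxmax are my own names.

```python
import pandas as pd

# Hypothetical data with the same counts as the question, shuffled within
# each id so the first row of a group is not its maximum.
index = pd.MultiIndex.from_tuples(
    [
        (3, "SSvEP"), (3, "mlNXN"), (3, "hCIpw"),
        (5, "hCIpw"), (5, "SSvEP"), (5, "mlNXN"),
        (11, "SSvEP"), (11, "mlNXN"),
    ],
    names=["id", "tOfmAJyI"],
)
unsorted = pd.Series([2, 4, 2, 1, 2, 1, 1, 2], index=index, name="V")

# First row of each group, whatever its value happens to be.
by_head = unsorted.groupby(level=0).head(1)

# Index label (an (id, tOfmAJyI) tuple) of the largest value in each group,
# used to select those rows from the original Series.
by_idxmax = unsorted.loc[unsorted.groupby(level=0).idxmax()]

print(by_head)
print(by_idxmax)
```

So head(1) is fine straight after value_counts, but idxmax (or nlargest) is the safer choice if the order is not guaranteed.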
Upvotes: 11