Kunal Goel
Kunal Goel

Reputation: 21

Select corresponding column value for max value of separate column(from a specific range of column) of pandas data frame

        year    month       quantity
DateNew         
2005-01 2005    January     49550
2005-02 2005    February    96088
2005-03 2005    March       28874
2005-04 2005    April       66917
2005-05 2005    May         24070
... ... ... ...
2018-08 2018    August      132629
2018-09 2018    September   104394
2018-10 2018    October     121305
2018-11 2018    November    121049
2018-12 2018    December    174984

This is the data frame that I have. I want to select the maximum quantity for each year and return the corresponding month for it.

I have tried this so far

df.groupby('year').max()

But in this, I get the max value for each and every column and hence getting September in each year. I have no clue how to approach the actual solution.

Upvotes: 2

Views: 79

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150745

I think you want idxmax:

df.loc[df.groupby('year')['quantity'].idxmax()]

Output:

         year     month  quantity
DateNew                          
2005-02  2005  February     96088
2018-12  2018  December    174984

Or just for the months:

df.loc[df.groupby('year')['quantity'].idxmax(), 'month']

Output:

DateNew
2005-02    February
2018-12    December
Name: month, dtype: object

Also, you can use sort_values followed by duplicated:

df.loc[~df.sort_values('quantity').duplicated('year', keep='last'), 'month']

Upvotes: 2

Related Questions