Reputation: 21
year month quantity
DateNew
2005-01 2005 January 49550
2005-02 2005 February 96088
2005-03 2005 March 28874
2005-04 2005 April 66917
2005-05 2005 May 24070
... ... ... ...
2018-08 2018 August 132629
2018-09 2018 September 104394
2018-10 2018 October 121305
2018-11 2018 November 121049
2018-12 2018 December 174984
This is the data frame that I have. I want to select the maximum quantity for each year and return the corresponding month for it.
I have tried this so far
df.groupby('year').max()
But in this, I get the max value for each and every column and hence getting September in each year. I have no clue how to approach the actual solution.
Upvotes: 2
Views: 79
Reputation: 150745
I think you want idxmax
:
df.loc[df.groupby('year')['quantity'].idxmax()]
Output:
year month quantity
DateNew
2005-02 2005 February 96088
2018-12 2018 December 174984
Or just for the months:
df.loc[df.groupby('year')['quantity'].idxmax(), 'month']
Output:
DateNew
2005-02 February
2018-12 December
Name: month, dtype: object
Also, you can use sort_values
followed by duplicated
:
df.loc[~df.sort_values('quantity').duplicated('year', keep='last'), 'month']
Upvotes: 2