Sam
Sam

Reputation: 141

Sort GroupBy object by certain max. value within individual groups

I am trying to sort my groupby object by the highest value for a certain year - i.e. the 2018 values. However, unsuccessfully.
Code:

aggs = {'sales':'sum')
df.groupby(by=['segment', 'year'].agg(aggs)

Default result by pandas when grouping
(sorted alphabetically by Level0, then ascending by Level1)

Segment Year Sales  
A 2016 2  
A 2017 10  
A 2018 6  
B 2016 1  
B 2017 4  
B 2018 8

Expected result:

Segment Year Sales  
B 2016 1  
B 2017 4  
B 2018 8  
A 2016 2  
A 2017 10  
A 2018 6

i.e. A is sorted behind B, because sum of B in 2018 is 8 while for A it is 6.

Upvotes: 1

Views: 62

Answers (1)

jezrael
jezrael

Reputation: 862511

Idea is create ordered Categorical with categories by filtered values with 2018 and sorted by Sales:

cats = df[df['Year'] == 2018].sort_values('Sales', ascending=False)['Segment']

aggs = {'Sales':'sum'}
df['Segment'] = pd.Categorical(df['Segment'], ordered=True, categories=cats)
df1 = df.groupby(by=['Segment', 'Year']).agg(aggs)

print (df1)
              Sales
Segment Year       
B       2016      1
        2017      4
        2018      8
A       2016      2
        2017     10
        2018      6

Upvotes: 2

Related Questions