Reputation: 141
I am trying to sort my groupby object by the highest value for a certain year - i.e. the 2018 values. However, unsuccessfully.
Code:
aggs = {'sales':'sum')
df.groupby(by=['segment', 'year'].agg(aggs)
Default result by pandas when grouping
(sorted alphabetically by Level0, then ascending by Level1)
Segment Year Sales
A 2016 2
A 2017 10
A 2018 6
B 2016 1
B 2017 4
B 2018 8
Expected result:
Segment Year Sales
B 2016 1
B 2017 4
B 2018 8
A 2016 2
A 2017 10
A 2018 6
i.e. A is sorted behind B, because sum of B in 2018 is 8 while for A it is 6.
Upvotes: 1
Views: 62
Reputation: 862511
Idea is create ordered Categorical
with categories by filtered values with 2018
and sorted by Sales
:
cats = df[df['Year'] == 2018].sort_values('Sales', ascending=False)['Segment']
aggs = {'Sales':'sum'}
df['Segment'] = pd.Categorical(df['Segment'], ordered=True, categories=cats)
df1 = df.groupby(by=['Segment', 'Year']).agg(aggs)
print (df1)
Sales
Segment Year
B 2016 1
2017 4
2018 8
A 2016 2
2017 10
2018 6
Upvotes: 2