Reputation: 39
I have a pd.DataFrame
that is the result of .groupby(['Product', 'Salesperson']).sum()
. I would now like to sort the Product column in order of Sales by Product (not Sales by Product and Salesperson). Then inside each Product group sort by each Salesperson's Sales.
Here is my starting df
:
And here is my desired answer a1
with some notes to clarify the ordering process:
Below is my sample df
and my desired answer a1
with a simple assertion test.
import pandas as pd
from pandas.util.testing import assert_frame_equal
import numpy as np
s1 = {'Product': {0: 'Soap',
1: 'Soap',
2: 'Pencil',
3: 'Paper',
4: 'Paper',
5: 'Bags',
6: 'Bags'},
'Salesperson': {0: 'Jack',
1: 'Jill',
2: 'Jill',
3: 'Jack',
4: 'Barry',
5: 'Barry',
6: 'Jack'},
'Sales': {0: 40, 1: 20, 2: 500, 3: 50, 4: 10, 5: 450, 6: 100}}
a1 = {'Product': {0: 'Bags',
1: 'Bags',
2: 'Pencil',
3: 'Paper',
4: 'Paper',
5: 'Soap',
6: 'Soap'},
'Salesperson': {0: 'Barry',
1: 'Jack',
2: 'Jill',
3: 'Jack',
4: 'Barry',
5: 'Jack',
6: 'Jill'},
'Sales': {0: 450, 1: 100, 2: 500, 3: 50, 4: 10, 5: 40, 6: 20}}
df = pd.DataFrame(s1).set_index(['Product', 'Salesperson']) # sample
a1 = pd.DataFrame(a1).set_index(['Product', 'Salesperson']) # desired answer
print(df)
print(a1)
def my_sort(df):
raise NotImplementedError
my_answer = my_sort(df)
assert_frame_equal(my_answer, a1)
Upvotes: 1
Views: 46
Reputation: 17884
You can groupby
by 'Product'
and create the dummy columns 'sum'
and 'max'
, which you can use by sorting:
g = df.groupby('Product')['Sales']
df['sum'] = g.transform('sum')
df['max'] = g.transform('max')
df.sort_values(['sum', 'max', 'Sales'], ascending=False)\
.drop(['sum', 'max'], axis=1)
Output:
Sales
Product Salesperson
Bags Barry 450
Jack 100
Pencil Jill 500
Paper Jack 50
Barry 10
Soap Jack 40
Jill 20
Upvotes: 2