Reputation: 18638
My DataFrame looks like this:
Name,Comp,Dept,Salary,Allowance
Sam,Google,Sales,10000,4500
Sandra,Google,Sales,2300,3450
Allan,Google,Sales,2400,1000
Helen,Google,Mktg,3456,700
Rick,Google,Mktg,5412,352
Farrow,Apple,Sales,9786,190
Isaac,Apple,Sales,4500,230
Tim,Apple,Mktg,4500,500
Ben,Apple,Mktg,3490,450
Julie,Apple,Mktg,4590,750
I would like to get two outputs in the following format:
Output 1: the mean Salary by Dept for the two companies
Dept Google Apple
Sales 4900 7143
Mktg 4434 4193
Output 2: the highest Salary by Dept for Google and Apple
Dept Google Apple
Sales 10000 9786
Mktg 5412 4590
Here is what I have so far:
data.groupby(['Dept','Comp'])['Salary'].mean().sort_values(ascending=False)
Dept   Comp
Sales  Apple     7143.000000
       Google    4900.000000
Mktg   Google    4434.000000
       Apple     4193.333333
Name: Salary, dtype: float64
How do I pivot the results to get the formats mentioned above?
Upvotes: 0
Views: 276
Reputation: 28946
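Use unstack. Group by both columns, then unstack the Comp level of the index so that each company becomes a column.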
In [111]: means = df.groupby(['Comp', 'Dept'])['Salary'].mean()
In [112]: means.unstack(0).sort_index(ascending=False)
Out[112]:
Comp         Apple  Google
Dept
Sales  7143.000000    4900
Mktg   4193.333333    4434
[2 rows x 2 columns]
In [110]: df.groupby(['Comp', 'Dept'])['Salary'].max().unstack(0).sort_index(ascending=False)
Out[110]:
Comp   Apple  Google
Dept
Sales   9786   10000
Mktg    4590    5412
[2 rows x 2 columns]
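For reference, here is a self-contained sketch of the same groupby and unstack approach that should run on a recent pandas. It assumes the data from the question is loaded from an inline CSV string; the names csv_text, mean_by_dept, max_by_dept and mean_pivot are mine, not from the question. The .loc step is only there to put rows and columns in the order the question asked for (Sales before Mktg, Google before Apple). The last line uses pivot_table, which is a different route to the same mean table rather than part of the approach above.

```python
import io

import pandas as pd

# Data copied from the question; the inline CSV string is an assumption
# about how the frame gets loaded.
csv_text = """Name,Comp,Dept,Salary,Allowance
Sam,Google,Sales,10000,4500
Sandra,Google,Sales,2300,3450
Allan,Google,Sales,2400,1000
Helen,Google,Mktg,3456,700
Rick,Google,Mktg,5412,352
Farrow,Apple,Sales,9786,190
Isaac,Apple,Sales,4500,230
Tim,Apple,Mktg,4500,500
Ben,Apple,Mktg,3490,450
Julie,Apple,Mktg,4590,750
"""
df = pd.read_csv(io.StringIO(csv_text))

# Group once, then aggregate twice.
grouped = df.groupby(["Dept", "Comp"])["Salary"]

# unstack("Comp") moves the Comp index level into the columns.
mean_by_dept = grouped.mean().unstack("Comp")
max_by_dept = grouped.max().unstack("Comp")

# Reorder rows and columns to match the layout requested in the question.
row_order = ["Sales", "Mktg"]
col_order = ["Google", "Apple"]
mean_by_dept = mean_by_dept.loc[row_order, col_order]
max_by_dept = max_by_dept.loc[row_order, col_order]

# Alternative: pivot_table builds the mean table in a single call.
mean_pivot = df.pivot_table(
    index="Dept", columns="Comp", values="Salary", aggfunc="mean"
)

print(mean_by_dept)
print(max_by_dept)
```

Unstacking by level name instead of position (unstack("Comp") rather than unstack(0)) keeps the result the same regardless of the order of the groupby keys.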
Upvotes: 1