Reputation: 269
How do you output the average of multiple columns?
Gender Age Salary Yr_exp cup_coffee_daily
Male 28 45000.0 6.0 2.0
Female 40 70000.0 15.0 10.0
Female 23 40000.0 1.0 0.0
Male 35 55000.0 12.0 6.0
I have df.groupby('Gender', as_index=False)['Age', 'Salary', 'Yr_exp'].mean(), but it still only returned the average of the first column, Age. How do you return the averages of specific columns as separate columns? Desired output:
Gender Age Salary Yr_exp
Male 31.5 50000.0 9.0
Female 31.5 55000.0 8.0
Thanks.
Upvotes: 15
Views: 45693
Reputation: 2026
You can also call .agg() on the grouped object:
df.groupby("Gender").agg({'Age' : 'mean', 'Salary' : 'mean', 'Yr_exp': 'mean'})
This would result in:
Age Salary Yr_exp
Gender
Female 31.5 55000 8
Male 31.5 50000 9
Using .agg() lets you apply different functions to different columns of a grouped object, for example:
df.groupby("Gender").agg({'Age' : 'mean', 'Salary' : ['min', 'max'], 'Yr_exp': 'sum'})
Outputs:
         Age Salary        Yr_exp
        mean    min    max    sum
Gender
Female  31.5  40000  70000     16
Male    31.5  45000  55000     18
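For anyone who wants to run the mixed aggregation above, here is a minimal self-contained sketch. It assumes the sample data from the question (without the coffee column). The variable name summary is only illustrative. Note that passing a list of functions for one column makes the result columns a two-level index, so a single value is looked up with a (column, function) pair.

```python
import pandas as pd

# Sample data copied from the question (coffee column left out).
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [28, 40, 23, 35],
    "Salary": [45000, 70000, 40000, 55000],
    "Yr_exp": [6, 15, 1, 12],
})

# A different aggregation per column; Salary gets two functions.
summary = df.groupby("Gender").agg(
    {"Age": "mean", "Salary": ["min", "max"], "Yr_exp": "sum"}
)

# The columns are now (column, function) pairs.
print(summary.columns.tolist())

# Look up one cell with a (column, function) tuple.
print(summary.loc["Female", ("Salary", "max")])
```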
Upvotes: 9
Reputation: 592
Given this dataframe:
df = pd.DataFrame({
"Gender": ["Male", "Female", "Female", "Male"],
"Age": [28, 40, 23, 35],
"Salary": [45000, 70000, 40000, 55000],
"Yr_exp": [6, 15, 1, 12]
})
df
Age Gender Salary Yr_exp
0 28 Male 45000 6
1 40 Female 70000 15
2 23 Female 40000 1
3 35 Male 55000 12
Group by gender and use the mean() function:
df.groupby("Gender").mean()
Age Salary Yr_exp
Gender
Female 31.5 55000.0 8.0
Male 31.5 50000.0 9.0
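One caveat, sketched below under the assumption that the real frame also carries the question's cup_coffee_daily column: calling mean() on the whole grouped frame averages every remaining column, not just the three that were asked for. The name all_means is only illustrative.

```python
import pandas as pd

# The question's full data, including the extra coffee column.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [28, 40, 23, 35],
    "Salary": [45000.0, 70000.0, 40000.0, 55000.0],
    "Yr_exp": [6.0, 15.0, 1.0, 12.0],
    "cup_coffee_daily": [2.0, 10.0, 0.0, 6.0],
})

# No column selection: every non-key column is averaged.
all_means = df.groupby("Gender").mean()
print(list(all_means.columns))
```

So if only some columns are wanted, they have to be selected explicitly after groupby().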
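Edit: you may need to change the way you select columns after groupby(). Comma-separated names inside single brackets are treated as one tuple key, so df['Age', 'Salary'] gives a KeyError, but df[['Age', 'Salary']] (a list of column names) returns the expected: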
Age Salary
0 28 45000
1 40 70000
2 23 40000
3 35 55000
Try changing
df.groupby("Gender", as_index=True)['Age', 'Salary', 'Yr_exp'].mean()
to
df.groupby("Gender", as_index=True)[['Age', 'Salary', 'Yr_exp']].mean()
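Putting the fix together with the question's data, a minimal sketch might look like this. It uses as_index=False as in the original question so that Gender stays a regular column. The name result is only illustrative. As far as I know, recent pandas releases reject the tuple-style selection outright instead of returning a partial result, so the list form is the one to rely on.

```python
import pandas as pd

# The question's full data, including the column we do not want averaged.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [28, 40, 23, 35],
    "Salary": [45000.0, 70000.0, 40000.0, 55000.0],
    "Yr_exp": [6.0, 15.0, 1.0, 12.0],
    "cup_coffee_daily": [2.0, 10.0, 0.0, 6.0],
})

# Double brackets: pass a list of column names to select after groupby.
result = df.groupby("Gender", as_index=False)[["Age", "Salary", "Yr_exp"]].mean()
print(result)
```

Groups come back sorted by key, so Female is listed before Male.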
Upvotes: 22