Ama

Reputation: 53

Get the mean of a pandas DataFrame column for each name

I need to get the average of the working years for each name in the HR department.

I tried this, but it does not work:

work = df.loc[Employee['Department'] == 'HR', [{'Year' : 'mean'}],
                 ['FirstName', 'LastName', 'Year','Department']].drop_duplicates()

The result should look like this (the averages shown are arbitrary; I did not calculate them):

   FirstName   LastName        aver_Year   Department   
0  Joe         Faulk           3.00        HR
1  Bryce       Benton          5.00        HR
2  Sarah       Cronin          2.00        HR
3  Gabriel     Montgomery      5.00        HR
4  Patricia    Genty-Andrade   6.00        HR

The source dataframe:

   FirstName  LastName        Year  Department
0  Joan       Hamilton-Huber  2     HR
1  Nathan     Brigmon         5     AustinCodeDepartment
2  Shawn      Lincoln         8     HR
3  Chris      Hernandez       2     AustinConventionCenter
4  John       Montgomery      7     AustinEnergy

Upvotes: 0

Views: 981

Answers (2)

DanCor

Reputation: 338

I would use the pandas groupby method:

df_gb = df.groupby(['Department','FirstName','LastName'])['Year'].mean().reset_index()
df_gb = df_gb[df_gb['Department']=='HR']

The first line computes the mean of Year for every combination of department, first name and last name. The second line keeps only the HR rows, so df_gb ends up holding the data you want.
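To see the two steps end to end, here is a minimal sketch on made-up data. The column names come from the question, but the rows are an assumption: Joan and Shawn are repeated with invented Year values so that there is something to average.

```python
import pandas as pd

# Hypothetical sample data (not the asker's real table): names are repeated
# on purpose so each HR employee has more than one Year value.
df = pd.DataFrame({
    "FirstName": ["Joan", "Joan", "Shawn", "Shawn", "Nathan"],
    "LastName": ["Hamilton-Huber", "Hamilton-Huber", "Lincoln", "Lincoln", "Brigmon"],
    "Year": [2, 4, 8, 6, 5],
    "Department": ["HR", "HR", "HR", "HR", "AustinCodeDepartment"],
})

# Mean of Year per (Department, FirstName, LastName); reset_index turns the
# group keys back into ordinary columns.
df_gb = df.groupby(["Department", "FirstName", "LastName"])["Year"].mean().reset_index()

# Keep only the HR rows.
df_gb = df_gb[df_gb["Department"] == "HR"]
print(df_gb)
```

Filtering after grouping means the averages for the other departments are computed and then thrown away, which is fine for a small table.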

Upvotes: 2

mht

Reputation: 391

First, you can select only the items related to the HR department:

df = df[df['Department'] == 'HR']

After that, you can group by FirstName and LastName and take the mean of the Year column:

work = df.groupby(['FirstName', 'LastName'])['Year'].mean()

Finally, because work is a Series rather than a DataFrame (so rename(columns=...) would raise an error), you can name it aver_Year and turn the index back into columns:

work = work.rename('aver_Year').reset_index()
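As a variation on the steps above, the filter, the mean and the new column name can be combined using named aggregation. This is only a sketch: the sample rows are invented (Joan and Shawn are repeated so the mean is meaningful), the variable name hr is my own, and it assumes a pandas version that supports named aggregation (0.25 or later).

```python
import pandas as pd

# Hypothetical sample data, not the asker's real table.
df = pd.DataFrame({
    "FirstName": ["Joan", "Joan", "Shawn", "Shawn", "Nathan"],
    "LastName": ["Hamilton-Huber", "Hamilton-Huber", "Lincoln", "Lincoln", "Brigmon"],
    "Year": [2, 4, 8, 6, 5],
    "Department": ["HR", "HR", "HR", "HR", "AustinCodeDepartment"],
})

# Filter first, into a new variable so the original df is left untouched.
hr = df[df["Department"] == "HR"]

# Named aggregation: the keyword (aver_Year) becomes the output column name,
# and as_index=False keeps FirstName/LastName as regular columns.
work = (
    hr.groupby(["FirstName", "LastName"], as_index=False)
      .agg(aver_Year=("Year", "mean"))
)

# Add the Department column back to match the layout asked for in the question.
work["Department"] = "HR"
print(work)
```

Filtering before grouping keeps the groupby small and avoids a separate rename step.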

Upvotes: 0
