Reputation: 53
I need to get the average of the working years for each name in the HR department.
I tried this:
work = df.loc[Employee['Department'] == 'HR', [{'Year' : 'mean'}],
['FirstName', 'LastName', 'Year','Department']].drop_duplicates()
The result should look like this. The average values are arbitrary; I did not calculate them.
FirstName LastName aver_Year Department
0 Joe Faulk 3.00 HR
1 Bryce Benton 5.00 HR
2 Sarah Cronin 2.00 HR
3 Gabriel Montgomery 5.00 HR
4 Patricia Genty-Andrade 6.00 HR
The source dataframe:
FirstName LastName Year Department
0 Joan Hamilton-Huber 2 HR
1 Nathan Brigmon 5 AustinCodeDepartment
2 Shawn Lincoln 8 HR
3 Chris Hernandez 2 AustinConventionCenter
4 John Montgomery 7 AustinEnergy
Upvotes: 0
Views: 981
Reputation: 338
I would use pandas' groupby:
df_gb = df.groupby(['Department','FirstName','LastName'])['Year'].mean().reset_index()
df_gb = df_gb[df_gb['Department']=='HR']
The first line computes the average of Year for each combination of department, first name and last name, and reset_index() turns the group keys back into ordinary columns. The second line keeps only the HR rows, so df_gb ends up holding the data you want.
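For what it's worth, here is a minimal runnable sketch of the two lines above, using the sample rows from the question. The second Joan Hamilton-Huber row (Year 4) is made up so that at least one average covers more than one value, and the final rename to aver_Year is my addition to match the column name in the question's desired output.

```python
import pandas as pd

# Sample rows from the question, plus one hypothetical duplicate employee
# (the second Joan Hamilton-Huber row) so the mean is not trivial.
df = pd.DataFrame({
    "FirstName": ["Joan", "Nathan", "Shawn", "Chris", "John", "Joan"],
    "LastName": ["Hamilton-Huber", "Brigmon", "Lincoln", "Hernandez",
                 "Montgomery", "Hamilton-Huber"],
    "Year": [2, 5, 8, 2, 7, 4],
    "Department": ["HR", "AustinCodeDepartment", "HR",
                   "AustinConventionCenter", "AustinEnergy", "HR"],
})

# Average Year per (Department, FirstName, LastName); reset_index() turns
# the group keys back into regular columns.
df_gb = df.groupby(["Department", "FirstName", "LastName"])["Year"].mean().reset_index()

# Keep only the HR rows, then rename the averaged column.
df_gb = df_gb[df_gb["Department"] == "HR"]
df_gb = df_gb.rename(columns={"Year": "aver_Year"}).reset_index(drop=True)

print(df_gb)
```

Note that this averages every department before filtering. On a large frame it is probably cheaper to filter to HR first and group afterwards.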
Upvotes: 2
Reputation: 391
First, you can select only the items related to the HR department:
df = df[df['Department'] == 'HR']
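After that, you can group by FirstName and LastName and take the mean of the Year column. Calling reset_index() makes the result a DataFrame rather than a Series:
work = df.groupby(['FirstName', 'LastName'])['Year'].mean().reset_index()
Finally, you can rename the column Year to aver_Year (rename with columns= needs a DataFrame, which is why reset_index() is used above):
work.rename(columns={'Year':'aver_Year'}, inplace=True)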
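As an alternative to renaming afterwards, pandas' named aggregation can produce the aver_Year column in one step. This is only a sketch under some assumptions: the sample rows come from the question, the second Joan Hamilton-Huber row is made up for illustration, and Department is added to the group keys purely so that it survives into the output, as in the question's desired result.

```python
import pandas as pd

# Sample rows from the question, plus one hypothetical extra row for Joan.
df = pd.DataFrame({
    "FirstName": ["Joan", "Nathan", "Shawn", "Chris", "John", "Joan"],
    "LastName": ["Hamilton-Huber", "Brigmon", "Lincoln", "Hernandez",
                 "Montgomery", "Hamilton-Huber"],
    "Year": [2, 5, 8, 2, 7, 4],
    "Department": ["HR", "AustinCodeDepartment", "HR",
                   "AustinConventionCenter", "AustinEnergy", "HR"],
})

# Filter to HR first, keeping the original df untouched.
hr = df[df["Department"] == "HR"]

# Named aggregation: output column aver_Year = mean of Year.
# as_index=False keeps the group keys as ordinary columns.
work = hr.groupby(["FirstName", "LastName", "Department"], as_index=False).agg(
    aver_Year=("Year", "mean")
)

# Reorder columns to match the layout shown in the question.
work = work[["FirstName", "LastName", "aver_Year", "Department"]]

print(work)
```

Assigning the filtered rows to a new name (hr) instead of overwriting df keeps the full dataframe available for other departments.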
Upvotes: 0