Reputation: 345
I'm new to pandas, I'm analyzing FIFA dataset, I need to select the top 5 players(based on Overall score), for each positions.
Also,
I want to check, if there are any players, who is present in more than one group, for the above condition. 2. To perform operations such as mean, max on individual groups.
Thanks
I couldn't retrieve other columns like Name and Wage, when I use this operation.
fifa_df.groupby('Position')['Overall'].nlargest(5).groupby('Position').head()
Sample Data:
Expecting results like below:
also I'm trying to find, all the players present in more than one group,
and trying to do operations like average wage for the top 5 players in each position.
dataset link - fifa_dataset
Upvotes: 3
Views: 6305
Reputation: 3910
fifa_df.sort_values('Overall', ascending=False).groupby('Position').head(5).sort_values(['Position', 'Overall'], ascending=False)
Name Position Overall Wage
1 Cristiano Ronaldo ST 94 €405K
10 R. Lewandowski ST 90 €205K
23 S. Agüero ST 89 €300K
16 H. Kane ST 89 €205K
36 G. Bale ST 88 €355K
517 S. Coleman RWB 80 €97K
465 P. Kadeřábek RWB 80 €39K
450 M. Ginter RWB 80 €28K
652 D. Caligiuri RWB 79 €33K
766 Pablo Maffeo RWB 78 €24K
...
Upvotes: 4
Reputation: 3739
try using nlargest in apply
res = df[['Name','Position','Overall','Wage']]
res= res.groupby(['Position'],as_index=False).apply(lambda x: x.nlargest(5, 'Overall'))
res.reset_index(inplace=True)
data.drop(['level_0','level_1'],axis=1,inplace=True)
print(data)
Name Position Overall Wage
0 A. Griezmann CAM 89 €145K
1 C. Eriksen CAM 88 €205K
2 Roberto Firmino CAM 86 €195K
3 T. Müller CAM 86 €135K
4 M. Özil CAM 86 €190K
5 D. Godín CB 90 €125K
6 S. Umtiti CB 87 €205K
7 M. Benatia CB 86 €160K
8 N. Otamendi CB 85 €170K
9 Naldo CB 85 €38K
....
I hope it would solve your problem
Upvotes: 5