Reputation: 7644
I have a df:
main_id b_code Scores
1 ABC 0.56
1 ABC 0.21
1 BCD 0.7
1 QWE 0.3
1 ZXC 0.8
2 ABC 0.26
2 ABC 0.81
2 BCD 0.24
2 QWE 0.87
2 ZXC 0.43
I have to find the top 2 b_code values for each main_id, based on their Scores.
My final result should be:
main_id b_code Scores
1 ZXC 0.8
1 ABC 0.56
2 QWE 0.87
2 ABC 0.81
I tried to do this with groupby and nlargest, but the results were wrong.
Upvotes: 1
Views: 37
Reputation: 863501
You can use sort_values + groupby + GroupBy.head:
df = df.sort_values(['main_id','Scores'], ascending=[True,False]).groupby('main_id').head(2)
print (df)
main_id b_code Scores
4 1 ZXC 0.80
2 1 BCD 0.70
8 2 QWE 0.87
6 2 ABC 0.81
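For anyone who wants to reproduce this, here is a minimal self-contained sketch of the sort_values + groupby + head approach. The DataFrame is rebuilt from the sample data in the question, and the variable name top2 is only illustrative:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "main_id": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "b_code": ["ABC", "ABC", "BCD", "QWE", "ZXC",
               "ABC", "ABC", "BCD", "QWE", "ZXC"],
    "Scores": [0.56, 0.21, 0.7, 0.3, 0.8,
               0.26, 0.81, 0.24, 0.87, 0.43],
})

# Sort by group, then by score descending, and keep the first 2 rows per group
top2 = (
    df.sort_values(["main_id", "Scores"], ascending=[True, False])
      .groupby("main_id")
      .head(2)
)
print(top2)
```

Note that for main_id 1 the second row is BCD (0.70), not the ABC (0.56) row listed in the question, because 0.70 is the second-highest score in that group. The original row labels (4, 2, 8, 6) are kept because head only filters rows.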
Or use set_index on all columns except main_id and Scores (here only b_code) + groupby + nlargest + reset_index:
df = df.set_index('b_code').groupby('main_id')['Scores'].nlargest(2).reset_index()
print (df)
main_id b_code Scores
0 1 ZXC 0.80
1 1 BCD 0.70
2 2 QWE 0.87
3 2 ABC 0.81
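A runnable sketch of the nlargest variant, again assuming the sample data from the question (the name top2_nlargest is only illustrative). Moving b_code into the index first is what lets it survive the Series-level nlargest and come back as a column after reset_index:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "main_id": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "b_code": ["ABC", "ABC", "BCD", "QWE", "ZXC",
               "ABC", "ABC", "BCD", "QWE", "ZXC"],
    "Scores": [0.56, 0.21, 0.7, 0.3, 0.8,
               0.26, 0.81, 0.24, 0.87, 0.43],
})

# b_code goes into the index so it is carried through nlargest,
# then reset_index turns main_id and b_code back into columns
top2_nlargest = (
    df.set_index("b_code")
      .groupby("main_id")["Scores"]
      .nlargest(2)
      .reset_index()
)
print(top2_nlargest)
```

Unlike the head approach, this returns a fresh 0..3 index instead of the original row labels.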
Upvotes: 2