Shubham R

Reputation: 7644

Groupby one column and top N from other columns pandas

I have a DataFrame:

main_id    b_code    Scores
  1          ABC      0.56
  1          ABC      0.21
  1          BCD      0.7
  1          QWE      0.3
  1          ZXC      0.8
  2          ABC      0.26
  2          ABC      0.81
  2          BCD      0.24
  2          QWE      0.87
  2          ZXC      0.43

I need to find the top 2 b_code values for each main_id, ranked by their Scores.

My final result should be:

main_id    b_code   Scores
1           ZXC      0.8
1           BCD      0.7
2           QWE      0.87
2           ABC      0.81

I tried groupby with nlargest, but the results were wrong.

Upvotes: 1

Views: 37

Answers (1)

jezrael

Reputation: 863501

You can use sort_values + groupby + GroupBy.head:

df = df.sort_values(['main_id','Scores'], ascending=[True,False]).groupby('main_id').head(2)
print (df)
   main_id b_code  Scores
4        1    ZXC    0.80
2        1    BCD    0.70
8        2    QWE    0.87
6        2    ABC    0.81
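
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same sort_values + groupby + head approach. The DataFrame is rebuilt from the sample data in the question; the variable name top2 is only illustrative and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "main_id": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "b_code": ["ABC", "ABC", "BCD", "QWE", "ZXC",
               "ABC", "ABC", "BCD", "QWE", "ZXC"],
    "Scores": [0.56, 0.21, 0.7, 0.3, 0.8,
               0.26, 0.81, 0.24, 0.87, 0.43],
})

# Sort by main_id ascending and Scores descending, so the best rows
# come first inside each main_id, then keep the first 2 rows per group.
top2 = (
    df.sort_values(["main_id", "Scores"], ascending=[True, False])
      .groupby("main_id")
      .head(2)
)
print(top2)
```

Because head keeps the original row labels, the result still carries the index values from df; chaining .reset_index(drop=True) should give a clean 0..n-1 index if that is preferred.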

Or use set_index on every column except main_id and Scores (here just b_code), so those columns survive in the index, then groupby + nlargest + reset_index:

df = df.set_index('b_code').groupby('main_id')['Scores'].nlargest(2).reset_index()
print (df)
   main_id b_code  Scores
0        1    ZXC    0.80
1        1    BCD    0.70
2        2    QWE    0.87
3        2    ABC    0.81
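
The nlargest variant can be tried the same way. This is a sketch under the same assumptions: the DataFrame is rebuilt from the question's sample data, and the name top2_nlargest is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "main_id": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "b_code": ["ABC", "ABC", "BCD", "QWE", "ZXC",
               "ABC", "ABC", "BCD", "QWE", "ZXC"],
    "Scores": [0.56, 0.21, 0.7, 0.3, 0.8,
               0.26, 0.81, 0.24, 0.87, 0.43],
})

# Move b_code into the index so it is carried along with each score,
# take the 2 largest Scores per main_id, then turn the resulting
# (main_id, b_code) index back into ordinary columns.
top2_nlargest = (
    df.set_index("b_code")
      .groupby("main_id")["Scores"]
      .nlargest(2)
      .reset_index()
)
print(top2_nlargest)
```

Unlike the head version, this one returns a fresh 0..n-1 index, and only the columns placed in the index plus Scores appear in the result.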

Upvotes: 2
