Jay

Reputation: 55

Extract top N rows for each value, based on the value in column

Using pandas (or any other Python library), I'd like to extract the top 3 rows, ranked by the "weight" value, for each "name".

For example, I have data like this:

    df =
    row weight name size1 size2
    1 10 A 4 4
    2 7 A 2 7
    3 5 A 9 7
    4 5 A 2 2
    5 2 A 6 3
    1 7 B 3 4
    2 6 B 8 3
    3 5 B 4 3
    4 3 B 4 5
    5 2 B 2 1

I'd like my output to look like this:

    row weight name size1 size2
    1 10 A 4 4
    2 7 A 2 7
    3 5 A 9 7
    1 7 B 3 4
    2 6 B 8 3
    3 5 B 4 3

or, keeping ties when the 3rd and 4th "weight" values are the same:

    row weight name size1 size2
    1 10 A 4 4
    2 7 A 2 7
    3 5 A 9 7
    4 5 A 2 2
    1 7 B 3 4
    2 6 B 8 3
    3 5 B 4 3

Upvotes: 1

Views: 214

Answers (1)

MaxU - stand with Ukraine

Reputation: 210832

Try this:

In [36]: df.groupby('name', group_keys=False).apply(lambda x: x.nlargest(3, 'weight'))
Out[36]:
   row  weight name  size1  size2
0    1      10    A      4      4
1    2       7    A      2      7
2    3       5    A      9      7
5    1       7    B      3      4
6    2       6    B      8      3
7    3       5    B      4      3
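
The output above matches the first desired result. For the second one, where rows tied on the 3rd-largest "weight" are all kept, one possible approach (a sketch, not part of the original answer) is to rank the weights within each group and filter on the rank. The DataFrame below is rebuilt from the sample data in the question; the variable names `rank` and `top3_with_ties` are chosen here for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "row":    [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "weight": [10, 7, 5, 5, 2, 7, 6, 5, 3, 2],
    "name":   ["A"] * 5 + ["B"] * 5,
    "size1":  [4, 2, 9, 2, 6, 3, 8, 4, 4, 2],
    "size2":  [4, 7, 7, 2, 3, 4, 3, 3, 5, 1],
})

# Rank weights within each name, largest first.
# method="min" gives tied rows the same (lowest) rank,
# so two rows tied for 3rd place both get rank 3.
rank = df.groupby("name")["weight"].rank(method="min", ascending=False)

# Keep every row whose rank is 3 or better; ties at rank 3 are all retained
top3_with_ties = df[rank <= 3]
print(top3_with_ties)
```

Because this filters with a boolean mask instead of `apply`, the original row order and index are preserved, and group A returns four rows since its 3rd and 4th weights are both 5.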

Upvotes: 1
