Maths12
Maths12

Reputation: 999

How can i get the top 3 smallest and top 3 highest value for each row in pandas?

I have a table like below:

group    shopspend  clothspend  foodspend music spend

A          1              2        12       34
B          1              2        12       34
C          1              2        12       34
d          1              2        12       34

I have alot more columns that above in my orignal frame. How do i return (for each group so by row) the top 3 lowest and top 3 highest per row and return the corresponding column value? Note: group is an index:

df = df.set_index('group') 

df.apply(lambda x: df.columns[np.argsort(x)], 1).iloc(axis=0)[:, :3]

tried above but it says IndexingError: Too many indexers.. not sure how i can get top highest and top 3 lowest columns for each row..

Upvotes: 0

Views: 810

Answers (1)

Cimbali
Cimbali

Reputation: 11415

You can use .agg() along the columns (thus applying it on each row) and return a Series with the appropriate elements. To find out which they are, use nlargest and nsmallest:

>>> df.agg(lambda s: pd.Series([*s.nlargest(3).index, *s.nsmallest(3).index],
...                            ['max1', 'max2', 'max3', 'min1', 'min2', 'min3']),
... axis='columns')
             max1       max2        max3       min1        min2       min3
group                                                                     
A      musicspend  foodspend  clothspend  shopspend  clothspend  foodspend
B      musicspend  foodspend  clothspend  shopspend  clothspend  foodspend
C      musicspend  foodspend  clothspend  shopspend  clothspend  foodspend
d      musicspend  foodspend  clothspend  shopspend  clothspend  foodspend

Some columns are duplicated but it’s because there are only 4 in your example: here, the 3rd smallest is also the 2nd largest and so on.

Upvotes: 5

Related Questions