Reputation: 999
I have a table like below:
group shopspend clothspend foodspend music spend
A 1 2 12 34
B 1 2 12 34
C 1 2 12 34
d 1 2 12 34
I have alot more columns that above in my orignal frame. How do i return (for each group so by row) the top 3 lowest and top 3 highest per row and return the corresponding column value? Note: group is an index:
df = df.set_index('group')
df.apply(lambda x: df.columns[np.argsort(x)], 1).iloc(axis=0)[:, :3]
tried above but it says IndexingError: Too many indexers.. not sure how i can get top highest and top 3 lowest columns for each row..
Upvotes: 0
Views: 810
Reputation: 11415
You can use .agg()
along the columns (thus applying it on each row) and return a Series with the appropriate elements. To find out which they are, use nlargest
and nsmallest
:
>>> df.agg(lambda s: pd.Series([*s.nlargest(3).index, *s.nsmallest(3).index],
... ['max1', 'max2', 'max3', 'min1', 'min2', 'min3']),
... axis='columns')
max1 max2 max3 min1 min2 min3
group
A musicspend foodspend clothspend shopspend clothspend foodspend
B musicspend foodspend clothspend shopspend clothspend foodspend
C musicspend foodspend clothspend shopspend clothspend foodspend
d musicspend foodspend clothspend shopspend clothspend foodspend
Some columns are duplicated but it’s because there are only 4 in your example: here, the 3rd smallest is also the 2nd largest and so on.
Upvotes: 5