Reputation: 73
If i have the following dataframe
Base Pay Overtime Pay Other Pay Benefits
Adam 200000 31000 5000 64000
Ben 210000 27000 7000 57000
Scott 190000 40000 9000 65000
David 220000 26000 4000 61000
Matthew 195000 29000 10000 63000
Mark 205000 37000 8000 59000
Then i am looking to return the following dataframe
1st 2nd 3rd
Base Pay David Ben Mark
Overtime Pay Scott Mark Adam
Other Pay Matthew Scott Mark
Benefits Scott Adam Matthew
I know how to compute the 3 largest values in each column, but not simultaneously.
Upvotes: 0
Views: 78
Reputation: 22503
Using argsort
:
df = pd.DataFrame(...).T
result = pd.DataFrame(df.columns[(-df.values).argsort(axis=1)[:, :3]],
columns=["1st","2nd","3rd"],
index=df.index)
print (result)
#
1st 2nd 3rd
BasePay David Ben Mark
OvertimePay Scott Mark Adam
OtherPay Matthew Scott Mark
Benefits Scott Adam Matthew
Upvotes: 3
Reputation: 908
A little shorter:
df = df.T.apply(lambda s: s.abs().nlargest(3).index.tolist(), axis=1)
>>> df2 = pd.DataFrame()
>>> df2[['1st','2nd','3rd']] = pd.DataFrame(df.values.tolist(), index= df.index)
Upvotes: 2
Reputation: 323276
Here is one way
s=df.stack().sort_values(ascending=False).groupby(level=1).head(3).reset_index()
s['Id']=s.groupby('level_1').cumcount()+1
s.pivot(index='level_1',columns='Id',values='level_0')
Out[114]:
Id 1 2 3
level_1
BasePay David Ben Mark
Benefits Scott Adam Matthew
OtherPay Matthew Scott Mark
OvertimePay Scott Mark Adam
Upvotes: 1