NJD
NJD

Reputation: 199

How to Sort/Rank a Pandas Dataframe based on multiple columns

I have a pandas table such as:

Entries Col1    Col2    Col3    Col4
Entry1  -1.46   93.93   3.33    92.51   
Entry2  -48.59  31.49   -22.97  80.25
Entry3  8.24    95.85   -5.05   90.29

I want to sort the Entries based on all 4 columns. Column 1 and 3 should be ranked by which is closer to 0 and Column 2 and 4 should be ranked by highest number.

At the moment I have something like this:

data.sort_values(cols, ascending=[False,True,False,True],inplace=True)

But all this does is sort by the first column and the other columns are insignificant to the sorting. I need the Entries to be sorted by all the columns. If Entry 1 is only the best in Column 1 and Entry 2 is best in the other 3 then Entry 3 should be sorted to the top.

Expected output:

Entries Col1    Col2    Col3    Col4
Entry1  -1.46   93.93   3.33    92.51   
Entry3  8.24    95.85   -5.05   90.29
Entry2  -48.59  31.49   -22.97  80.25

Entry 1 is best in Col:1,3 and 4 Entry 2 is worse in all Cols Entry 3 is best in Col2 and second best in others.

Thanks.

Upvotes: 0

Views: 1915

Answers (1)

Valdi_Bo
Valdi_Bo

Reputation: 30971

Start from creating 2 auxiliary columns, Col1a and Col3a holding absolute values of respective source columns:

data['Col1a'] = data.Col1.abs()
data['Col3a'] = data.Col3.abs()

The sort your DataFrame:

data.sort_values(['Col1a', 'Col2', 'Col3a',  'Col4'],
    ascending=[True, False, True, False], inplace=True)

Note that the ascending parameter is different than in your code.

And finally, drop the auxiliary columns:

data.drop(columns=['Col1a', 'Col3a'], inplace=True)

Upvotes: 1

Related Questions