Mohamed Abass
Mohamed Abass

Reputation: 81

how to rank rows at python using pandas in multi columns

suppose i have the following pandas dataframe , and i need to rank rows at new columns ( i meant if i want to rank 4 rows i will creat 4 new rows )

at the following dataframe , i have three numerical columns , i need to compare and rank each row , there is three rows so i need to craete three new columns to compare the value in each colmuns with the row

Revenue-SaleCount-salesprices-ranka-rankb-rankc

300------10-----------8000--------2--------1-----3

100----9000-----------1000--------1--------3-----2

how can i do that with simple code and using for loop thanks in advance

import pandas as pd

df = pd.DataFrame({'Revenue':[300,9000,1000,750,500,2000,0,600,50,500],
    'Date':['2016-12-02' for i in range(10)],
    'SaleCount':[10,100,30,35,20,100,0,30,2,20],
    'salesprices':[8000,1000,500,700,2500,3800,16,7400,3200,21]})


print(df)

Upvotes: 0

Views: 414

Answers (1)

Erfan
Erfan

Reputation: 42916

We can write a loop with string.ascii_lowercase and make each column with rank over axis=1

import string

cols = ['Revenue', 'SaleCount', 'salesprices']

for index, col in enumerate(cols):
    df[f'rank{string.ascii_lowercase[index]}'] = df[cols].rank(axis=1)[col]

Output:

print(df)
   Revenue        Date  SaleCount  salesprices  ranka  rankb  rankc
0      300  2016-12-02         10         8000    2.0    1.0    3.0
1     9000  2016-12-02        100         1000    3.0    1.0    2.0
2     1000  2016-12-02         30          500    3.0    1.0    2.0
3      750  2016-12-02         35          700    3.0    1.0    2.0
4      500  2016-12-02         20         2500    2.0    1.0    3.0
5     2000  2016-12-02        100         3800    2.0    1.0    3.0
6        0  2016-12-02          0           16    1.5    1.5    3.0
7      600  2016-12-02         30         7400    2.0    1.0    3.0
8       50  2016-12-02          2         3200    2.0    1.0    3.0
9      500  2016-12-02         20           21    3.0    1.0    2.0

Note I used f-string which is only supported with Python version > 3.4. Else use .format string formatting like following:

import string

cols = ['Revenue', 'SaleCount', 'salesprices']

for index, col in enumerate(cols):
    df['rank{}'.format(string.ascii_lowercase[index])] = df[cols].rank(axis=1)[col] 

Upvotes: 2

Related Questions