Rank by multiple columns grouping by another column

Question

I am trying to rank a dataframe grouping by a key column as per the value of 2 different columns.

Each row represent a route with a given key. Different routes can share the same key. Each route has a number of touches and a total time. I want to rank by touches and by total time if touches are the same.

Considering the following dataframe

    route   key touches total_tt
0   R1  key1    1   10
1   R2  key1    2   20
2   R3  key1    2   23
3   R4  key1    3   20
4   R5  key1    3   20
5   R6  key1    3   30
6   R7  key2    3   30

I would expect the following ranking

    route   key touches total_tt    rank
0   R1  key1    1   10  1
1   R2  key1    2   20  2
2   R3  key1    2   23  3
3   R4  key1    3   20  4
4   R5  key1    3   20  4
5   R6  key1    3   30  5
6   R7  key2    3   30  1

ALollz · Accepted Answer

sort and groupby + ngroup to label each group with your ranking. Subtracting the minimum rank within each 'key' then gives the desired ranking within group.

cols = ['key', 'touches', 'total_tt']

df['rank'] = df.sort_values(cols).groupby(cols, sort=False).ngroup()
df['rank'] = df['rank'] - df.groupby('key')['rank'].transform('min') + 1

Output:

  route   key  touches  total_tt  rank
0    R1  key1        1        10     1
1    R2  key1        2        20     2
2    R3  key1        2        23     3
3    R4  key1        3        20     4
4    R5  key1        3        20     4
5    R6  key1        3        30     5
6    R7  key2        3        30     1

Rank by multiple columns grouping by another column

Answers (1)

Related Questions