Reputation: 170
I am trying to rank a dataframe grouping by a key column as per the value of 2 different columns.
Each row represent a route with a given key. Different routes can share the same key. Each route has a number of touches and a total time. I want to rank by touches and by total time if touches are the same.
Considering the following dataframe
route key touches total_tt
0 R1 key1 1 10
1 R2 key1 2 20
2 R3 key1 2 23
3 R4 key1 3 20
4 R5 key1 3 20
5 R6 key1 3 30
6 R7 key2 3 30
I would expect the following ranking
route key touches total_tt rank
0 R1 key1 1 10 1
1 R2 key1 2 20 2
2 R3 key1 2 23 3
3 R4 key1 3 20 4
4 R5 key1 3 20 4
5 R6 key1 3 30 5
6 R7 key2 3 30 1
Upvotes: 2
Views: 950
Reputation: 59519
sort
and groupby
+ ngroup
to label each group with your ranking. Subtracting the minimum rank within each 'key'
then gives the desired ranking within group.
cols = ['key', 'touches', 'total_tt']
df['rank'] = df.sort_values(cols).groupby(cols, sort=False).ngroup()
df['rank'] = df['rank'] - df.groupby('key')['rank'].transform('min') + 1
Output:
route key touches total_tt rank
0 R1 key1 1 10 1
1 R2 key1 2 20 2
2 R3 key1 2 23 3
3 R4 key1 3 20 4
4 R5 key1 3 20 4
5 R6 key1 3 30 5
6 R7 key2 3 30 1
Upvotes: 2