Reputation: 3747
Is it possible to do something like this?
df = pd.DataFrame({
"sort_by": ["a","a","a","a","b","b","b", "a"],
"x": [100.5,200,200,500,1,2,3, 200],
"y": [4000,2000,2000,1000,500.5,600.5,600.5, 100.5]
})
df = df.sort_values(by=["x","y"], ascending=False)
I want to sort by the sort_by column and, within each sort_by group, rank the rows by x (descending), using y to break ties. Rows with identical x and y should share a rank.
The ideal output would be:
sort_by x y rank
a 500 1000 1
a 200 2000 2
a 200 2000 2
a 200 100.5 3
a 100.5 4000 4
b 3 600.5 1
b 2 600.5 2
b 1 500.5 3
Upvotes: 2
Views: 2666
Reputation: 323226
You can use pd.factorize within each group after sort_values:
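# Sort by group first, then by x and y descending, so each group is contiguous.
df = df.sort_values(by=["sort_by", "x", "y"], ascending=[True, False, False])
# Combine x and y into one tuple per row so rows tied on both columns share a key.
df['rank'] = list(zip(df.x, df.y))
# factorize numbers keys from 0 in order of appearance; add 1 for 1-based ranks.
df['rank'] = df.groupby('sort_by', sort=False)['rank'].transform(lambda s: pd.factorize(s)[0] + 1)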
df
Out[615]:
sort_by x y rank
3 a 500.0 1000.0 1
1 a 200.0 2000.0 2
2 a 200.0 2000.0 2
7 a 200.0 100.5 3
0 a 100.5 4000.0 4
6 b 3.0 600.5 1
5 b 2.0 600.5 2
4 b 1.0 500.5 3
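A possible alternative, not part of the original answer, is to skip the tuple column: once the rows are sorted, every first occurrence of a (sort_by, x, y) combination starts a new rank, so a cumulative count of those first occurrences within each group should give the same dense rank. This is only a sketch under that assumption; the helper name is_new is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "sort_by": ["a", "a", "a", "a", "b", "b", "b", "a"],
    "x": [100.5, 200, 200, 500, 1, 2, 3, 200],
    "y": [4000, 2000, 2000, 1000, 500.5, 600.5, 600.5, 100.5],
})

# Group rows together, then order by x and y descending inside each group.
df = df.sort_values(by=["sort_by", "x", "y"], ascending=[True, False, False])

# is_new (hypothetical name) is 1 on the first row of each distinct
# (sort_by, x, y) combination and 0 on repeats of that combination.
is_new = (~df.duplicated(subset=["sort_by", "x", "y"])).astype(int)

# Counting the first occurrences cumulatively per group yields a dense rank.
df["rank"] = is_new.groupby(df["sort_by"]).cumsum()

print(df)
```

Because it relies on sorting plus duplicated, this variant only needs the rank columns to be sortable and comparable.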
Upvotes: 2