Reputation: 93
I need to convert a dataframe with multiple columns to a "ranked order" table by rows. This is my test data:
raw = [{'MS':100, 'KR':1, 'KU': 0.1, 'BR': 33},
{'MS':9, 'KR':4, 'KU': 0.4, 'BR': 40},
{'MS':40, 'KR':6, 'KU': 0.2, 'BR': 50},
{'MS':59, 'KR':3, 'KU': 0.4, 'BR': 30},
{'MS':33, 'KR':8, 'KU': 0.5, 'BR': 20},
{'MS':40, 'KR':2, 'KU': 0.2, 'BR': 90},
{'MS':70, 'KR':2, 'KU': 0.01, 'BR': 80},
]
raw_df = pd.DataFrame(raw, index=['A','B','C','D','E','F','G'])
Basically what I need is to go from here:
MS KR KU BR
A 100 1 0.10 33
B 9 4 0.40 40
C 40 6 0.20 50
D 59 3 0.40 30
E 33 8 0.50 20
F 40 2 0.20 90
G 70 2 0.01 80
to here(where the highest raw values are ranked higher):
MS KR KU BR
C 40 6 0.20 50
E 33 8 0.50 20
D 59 3 0.40 30
F 40 2 0.20 90
B 9 4 0.40 40
G 70 2 0.01 80
A 100 1 0.10 33
I have rank ordered every column and then created a row rank by taking the average and then sorted the dataframe by that new row. Right now I have the following ranked dataframe but I need to copy over the contents with the old raw data and not the current data.
MS KR KU BR rank
C 4.5 2.0 4.5 3.0 3.500
E 6.0 1.0 1.0 7.0 3.750
D 3.0 4.0 2.5 6.0 3.875
F 4.5 5.5 4.5 1.0 3.875
B 7.0 3.0 2.5 4.0 4.125
G 2.0 5.5 7.0 2.0 4.125
A 1.0 7.0 6.0 5.0 4.750
I can use iterrrow() but it is expensive for large datasets. Is there some panda-fu here that can accomplish this?
Upvotes: 1
Views: 170
Reputation: 1875
If you create it from raw_df you can do:
raw_df.reindex(raw_df.rank(ascending=False).mean(axis=1).sort_values().index)
Output:
MS KR KU BR
C 40 6 0.20 50
E 33 8 0.50 20
D 59 3 0.40 30
F 40 2 0.20 90
B 9 4 0.40 40
G 70 2 0.01 80
A 100 1 0.10 33
We set rank by descending order, takes its mean by columns and sort raw_df by using .reindex()
method.
Upvotes: 1
Reputation: 195408
If I've understood you right: If df_rank
contains your sorted dataframe and raw_df
your unsorted one, you can do:
print(raw_df.loc[df_rank.index])
Prints:
MS KR KU BR
C 40 6 0.20 50
E 33 8 0.50 20
D 59 3 0.40 30
F 40 2 0.20 90
B 9 4 0.40 40
G 70 2 0.01 80
A 100 1 0.10 33
Upvotes: 2