Nivi
Nivi

Reputation: 1147

How do I create a 'Rank' column in pandas?

I am trying to rank a column (col 1) in pandas. If there is a tie, I want to look at another column (col 2) just for those records and do a tie breaker. IF they are same even in that column, I want to just assign the ranks randomly but each row must have a unique rank.

Example:

col1  |  col 2 | Rank
 20   |  3     | 3
 22   |  2     | 2
 22   |  2.5   | 1
 3    |  1     | 4
 3    |  1     | 5

Upvotes: 2

Views: 1576

Answers (1)

Fred
Fred

Reputation: 1492

df['Rank'] = df.sort_values(by=['col1', 'col2'], ascending=False) \
               .reset_index() \
               .sort_values('index') \
               .index + 1

This code goes through these steps:

  1. Sorts the data frame into a new temporary one
  2. Resets the index, keeping the old one as a new column called 'index'. Make sure you don't have any starting columns named 'index'. If you do have one, the new column is called level_0. If you have both index and level_0, the code will throw an exception.
  3. Sorts the temporary dataframe according to the old index (so it's in the original order again)
  4. Uses the new index (which is in the rank order we want) as the new 'Rank' column. Plus 1 because your question seems to want ranks that start at 1, not 0

Upvotes: 6

Related Questions