Reputation: 841

Adding a ranking column to a dataframe

This seems like it must be a very common task, but I can't find a solution on Google or SO. I want to add a column called 'rank' to 'dat1' that reflects the sequence 'order.scores' imposes on 'dat'. I tried using row.names(), but the row names are based on 'dat', not 'dat1'. I also tried 'dat$rank <- rank(dat1)', but this produces an error message.

fname <- c("Joe", "Bob", "Bill", "Tom", "Sue", "Sam", "Jane", "Ruby")
score <- c(500, 490, 500, 750, 550, 500, 210, 320)
dat <- data.frame(fname, score)
order.scores <- order(dat$score, dat$fname)
dat1 <- dat[order.scores, ]

Upvotes: 15

Answers (7)

Anil kumar

Reputation: 19

In general, a rank function orders the numeric values in a column from lowest to highest and gives each value its position.

Example: if Salary is a column holding four- and five-digit salaries, applying the rank function returns the rank of each salary among the others.

The pandas line below ranks from highest to lowest, because ascending = False:

df['Salary'].rank(ascending = False).astype(int)
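
Since the question is about R, here is a rough base-R counterpart to the pandas line above. It is only a sketch: the `df` data frame and its salary values are invented for illustration, and `ties.method = "min"` is one assumed choice for getting whole-number ranks.

```r
# Hypothetical salary data, invented for illustration only
df <- data.frame(Salary = c(4200, 15000, 9800, 15000))

# Negating the column makes the highest salary rank first;
# ties.method = "min" gives tied salaries the same integer rank
df$rank <- rank(-df$Salary, ties.method = "min")
df$rank
```

With this choice the two tied top salaries both get rank 1 and the next salary gets rank 3.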

Upvotes: -1

Sandy

Reputation: 1148

For the given dataframe dat:

  fname score
  Joe   500
  Bob   490
  Bill  500
  Tom   750
  Sue   550
  Sam   500
  Jane  210
  Ruby  320

We can also use dplyr as below. It assigns the lowest rank to the smallest value, which is 210 in this case.

library(dplyr)

ranks <- dat %>%
  mutate(ranks = order(order(score)))

The output will be as below:

fname score ranks
  Joe   500     4
  Bob   490     3
 Bill   500     5
  Tom   750     8
  Sue   550     7
  Sam   500     6
 Jane   210     1
 Ruby   320     2

If the converse is required, i.e., rank 1 should be assigned to the highest value, which is 750 in this case, change the code slightly as below:

ranks <- dat %>%
  mutate(ranks = order(order(score, decreasing = TRUE)))

The output in this case will be as below:

fname score ranks
  Joe   500     3
  Bob   490     6
 Bill   500     4
  Tom   750     1
  Sue   550     2
  Sam   500     5
 Jane   210     8
 Ruby   320     7
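
The double order() call can look opaque, so here is a small base-R sketch of what it appears to do, using only the score vector from the question. As far as I can tell, it gives the same numbering as rank() with ties.method = "first"; the variable names asc_ranks and desc_ranks are mine.

```r
score <- c(500, 490, 500, 750, 550, 500, 210, 320)

# order(order(x)) inverts the sort permutation, giving each element's
# position in the sorted vector; ties are numbered in order of appearance
asc_ranks  <- order(order(score))
desc_ranks <- order(order(score, decreasing = TRUE))

# rank() with ties.method = "first" yields the same numbering
stopifnot(all(asc_ranks  == rank(score,  ties.method = "first")))
stopifnot(all(desc_ranks == rank(-score, ties.method = "first")))

asc_ranks
desc_ranks
```

If the checks pass silently, the two approaches agree on this data, and rank() may be the more readable choice.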

Upvotes: 0

prmlmu

Reputation: 663

You can also use arrange and mutate from dplyr:

library(dplyr)
# Sort by descending score, then number the rows 1..n
dat <- dat %>%
  arrange(desc(score)) %>%
  mutate(rank = row_number())
dat

Upvotes: 5

Anshuman Kirty

Reputation: 176

You can use:

dat$Rank <- rank(dat$score)
dat$Rank
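
Note that rank() averages tied values by default, which may or may not be what you want here. A quick sketch with the question's scores shows how the ties.method argument changes the result for the three scores of 500:

```r
score <- c(500, 490, 500, 750, 550, 500, 210, 320)

# Default: tied scores share the average of the positions they occupy
rank(score)                          # the three 500s each get 5

# "min": tied scores all get the lowest position in their group
rank(score, ties.method = "min")     # the three 500s each get 4

# "first": ties are broken by order of appearance, so ranks are unique
rank(score, ties.method = "first")   # the three 500s get 4, 5, 6
```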

Upvotes: 2

akrun

Reputation: 887991

You could do:

dat$rank <- order(order.scores)
dat$rank
# [1] 5 3 4 8 7 6 1 2

Upvotes: 1

r2evans

Reputation: 161155

Try:

## dat, dat1, and order.scores as defined
dat <- data.frame(fname=c("Joe", "Bob", "Bill", "Tom", "Sue","Sam","Jane","Ruby"),
                  score=c(500, 490, 500, 750, 550, 500, 210, 320))
order.scores <- order(dat$score, dat$fname)
dat1 <- dat[order.scores,]
dat1$rank <- rank(dat1$score)
dat1
##    fname score rank
##  7  Jane   210    1
##  8  Ruby   320    2
##  2   Bob   490    3
##  3  Bill   500    5
##  1   Joe   500    5
##  6   Sam   500    5
##  5   Sue   550    7
##  4   Tom   750    8

This shows the ties in rank based on $score. If you don't want ties in $rank, then you might as well say dat1$rank <- 1:nrow(dat1) since they are already in order.
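
For completeness, the tie-free numbering mentioned above could be sketched like this, assuming the question's ordering by score and then fname. seq_len() is used in place of 1:nrow() only because it also behaves sensibly on an empty data frame.

```r
dat <- data.frame(fname = c("Joe", "Bob", "Bill", "Tom", "Sue", "Sam", "Jane", "Ruby"),
                  score = c(500, 490, 500, 750, 550, 500, 210, 320))
dat1 <- dat[order(dat$score, dat$fname), ]

# Rows are already sorted, so the row position is a tie-free rank
dat1$rank <- seq_len(nrow(dat1))
dat1
```

Here the three scores of 500 get ranks 4, 5 and 6, in alphabetical order of fname.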

Upvotes: 9

josliber

Reputation: 44340

You can compute a ranking from an ordering as follows:

dat$rank <- NA
dat$rank[order.scores] <- 1:nrow(dat)
dat
#   fname score rank
# 1   Joe   500    5
# 2   Bob   490    3
# 3  Bill   500    4
# 4   Tom   750    8
# 5   Sue   550    7
# 6   Sam   500    6
# 7  Jane   210    1
# 8  Ruby   320    2
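
To see why this works, it may help to check that assigning 1..n into the ordered positions gives the same vector as order(order.scores). The sketch below uses the question's data; rank_by_assign and rank_by_order are names I made up for the comparison.

```r
dat <- data.frame(fname = c("Joe", "Bob", "Bill", "Tom", "Sue", "Sam", "Jane", "Ruby"),
                  score = c(500, 490, 500, 750, 550, 500, 210, 320))
order.scores <- order(dat$score, dat$fname)

# Writing 1..n into the positions given by the ordering inverts the permutation
rank_by_assign <- integer(nrow(dat))
rank_by_assign[order.scores] <- seq_len(nrow(dat))

# order() applied to a permutation returns its inverse, so this should match
rank_by_order <- order(order.scores)
stopifnot(identical(rank_by_assign, rank_by_order))

# Sanity check: sorting by the new rank reproduces the order of dat1
dat$rank <- rank_by_assign
stopifnot(identical(dat[order(dat$rank), "fname"], dat[order.scores, "fname"]))
```

Both routes keep dat in its original row order, which is handy when the rank column has to sit next to the unsorted data.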

Upvotes: 15
