Ashu

Reputation: 111

How can I create rank variables across several columns in R?

Hello, dear community members. I'm trying to create ranking variables for certain columns in R. For example, I want to transform this data frame

> df 
  X1 X2 X3 X4 X5 
1  1  4  7  3  2
2  2  5  8  4  3
3  3  6  3  5  4
4  4  1  2  6  5
5  5  2  1  7  6

into

> df
  X1 X2 X3 X4 X5 x1_rank x2_rank x3_rank
1  1  4  7  3  2       3       2       1
2  2  5  8  4  3       3       2       1
3  3  6  3  5  4       3       1       3
4  4  1  2  6  5       1       3       2
5  5  2  1  7  6       1       2       3

like this (select X1 to X3 and rank their values against each other within each row).

I tried this code

for (i in 1:nrow(df)) {
  df_rank <- df[i, ] %>% 
  dplyr::select(X1, X2, X3, X4) %>% 
  base::rank() 
}

I imagine this can be solved with a for loop, but I'm a beginner in R and don't understand why this doesn't work.

Upvotes: 3

Views: 252

Answers (1)

Donald Seinen

Reputation: 4419

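One way to achieve this is to rank the negated values within each row, so that the largest value gets rank 1, with ties resolved by the "min" method so that tied values share the lowest rank of their group.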

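library(magrittr)  # provides the %>% pipe used later on

df <- tibble::tribble(
  ~x1, ~x2, ~x3, ~x4, ~x5,
  1,4,7,3,2,
  2,5,8,4,3,
  3,6,3,5,4,
  4,1,2,6,5,
  5,2,1,7,6
)

# Rank columns 1 to 3 within each row; negating makes the largest value rank 1.
# apply() returns one column per row of df, so transpose back with t().
ranks <- t(apply(-df[, 1:3], 1, rank, ties.method = "min"))
colnames(ranks) <- paste0(colnames(ranks), "_rank")
cbind(df, ranks)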

  x1 x2 x3 x4 x5 x1_rank x2_rank x3_rank
1  1  4  7  3  2       3       2       1
2  2  5  8  4  3       3       2       1
3  3  6  3  5  4       2       1       2
4  4  1  2  6  5       1       3       2
5  5  2  1  7  6       1       2       3
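
To see why negating works, it may help to look at a single row in isolation. The sketch below uses row 3 of the example data (3, 6, 3); the names `row3`, `asc` and `desc` are only for illustration. The tie-handling argument is spelled out as `ties.method`; R also accepts the abbreviation `ties` through partial argument matching.

```r
# Row 3 of the example data: x1 = 3, x2 = 6, x3 = 3
row3 <- c(x1 = 3, x2 = 6, x3 = 3)

# rank() ranks in ascending order, so the smallest value gets rank 1
asc <- rank(row3, ties.method = "min")

# Negating flips the order, so the largest value gets rank 1;
# ties.method = "min" gives tied values the lowest rank of their group
desc <- rank(-row3, ties.method = "min")

print(asc)   # x1 = 1, x2 = 3, x3 = 1
print(desc)  # x1 = 2, x2 = 1, x3 = 2
```

The second result matches the third row of the output above.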

As for why your code does not work: a for loop does not return anything. Yours assigns df_rank anew on every iteration, so each row's result overwrites the previous one and nothing is ever added to df. To fix it, declare an object outside the loop, fill in one row per iteration, and finally bind that object to the original data.

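# Pre-allocate one row of ranks per row of df
m <- matrix(nrow = nrow(df), ncol = 3)
for (i in seq_len(nrow(df))) {
  # unlist() turns the one-row selection into a plain numeric vector;
  # negating it makes the largest value rank 1
  m[i, ] <- rank(-unlist(df[i, c("x1", "x2", "x3")]), ties.method = "min")
}
colnames(m) <- paste0(names(df)[1:3], "_rank")
# dplyr:: prefix because only magrittr was attached above
dplyr::bind_cols(df, as.data.frame(m))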
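
The overwriting problem can be reproduced without any data frame. The following is a minimal sketch with made-up names (`last_only`, `all_values`), contrasting plain assignment inside a loop with filling a pre-allocated vector by index.

```r
# Plain assignment inside a loop: each iteration discards the previous value
last_only <- NULL
for (i in 1:3) {
  last_only <- i * 10
}

# Pre-allocating and assigning by index keeps one result per iteration
all_values <- numeric(3)
for (i in 1:3) {
  all_values[i] <- i * 10
}

print(last_only)   # 30
print(all_values)  # 10 20 30
```

The first pattern is what the loop in the question does with df_rank; the second is the same idea as filling `m[i, ]` row by row.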

Upvotes: 1
