Reputation: 43

Creating a sequential ranking based on previous ratings

I have an issue with sequentially updating rankings and no matter how I try to search for a solution - or come up with one myself - I fail.

I am trying to analyse results of an experiment of sequential choice in which participants had to find the best possible option (the option with the highest rating). They were presented with a rating in every trial.

I have an ID, an order and a rating variable for every choice. ID is the participant, rating represents how good the option is (the higher the rating the better) and order is the number of the trial (in this example there were 4 trials)

ID       rating        order
1        4             1
1        3             2
1        5             3
1        2             4
2        3             1
2        5             2
2        2             3
2        1             4

I would like to create a new variable called "current_rank" which is basically the ranking of the rating of the current choice. This variable always needs to take into consideration all previous trials and ratings so e.g. for the participant with ID "1" this would be:

Trial 1: rating = 4, which means this is the best rating so far, current_rank = 1

Trial 2: rating = 3, which means this is the second best rating so far, current_rank = 2

Trial 3: rating = 5, which means this is the best rating so far, making it the new number 1 so, current_rank = 1

Trial 4: rating = 2, which means this is nowhere near the best, current_rank = 4

If I could do this with all participants and all choices my database should look like this:

ID       rating        order        current_rank
1        4             1            1
1        3             2            2
1        5             3            1
1        2             4            4
2        3             1            1
2        5             2            1
2        2             3            3
2        1             4            4

I could successfully create an overall ranking variable like this:

db %>%
  arrange(ID, order) %>%
  group_by(ID) %>%
  mutate(ovr_rank = min_rank(desc(rating)))

But my goal is to create a variable that is something of a sequential ranking. This would make it possible to see what kind of opinion the participant may have formed about the current rating based on the previous ratings, without knowing what future ratings might be. I tried creating loops or use the apply functions, but couldn't come up with a solution yet.

Any and all ideas are greatly appreciated!

Upvotes: 4

Answers (3)

chinsoon12

Reputation: 25225

Here are 2 options using data.table:

1) non-equi join to find all trials before and incl current trial, rank the rating and extract the current rank:

DT[, cr := .SD[.SD, on=.(ID, trial<=trial), by=.EACHI, order(order(-rating))[.N]]$V1]

2) non-equi join to find number of ratings that are higher than current rating in trials before current trial:

DT[, cr2 := DT[DT, on=.(ID, trial<=trial, rating>rating), by=.EACHI, .N + 1L]$V1]

Note that there might be ties in ratings and it will be good to specify how ratings ties should be handled.

output:

   ID rating trial cr cr2
1:  1      4     1  1   1
2:  1      3     2  2   2
3:  1      5     3  1   1
4:  1      2     4  4   4
5:  2      3     1  1   1
6:  2      5     2  1   1
7:  2      2     3  3   3
8:  2      1     4  4   4

data:

library(data.table)
DT <- fread("ID       rating        trial
1        4             1
1        3             2
1        5             3
1        2             4
2        3             1
2        5             2
2        2             3
2        1             4")

Upvotes: 1

GoGonzo

Reputation: 2867

Use runner to apply any R function in cumulative window (or rolling window). Below I used runner which rolls rating and applies rank function on "available" data at the moment (cumulative rank). Uncomment print to exhibit what lands into function(x).

library(dplyr)
library(runner)

data %>%
  arrange(ID, order) %>%
  group_by(ID) %>%
  mutate(
    current_rank = runner(
      x = rating,
      f = function(x) {
        # print(x)
        rank_available_at_the_moment <- rank(-x, ties.method = "last")
        tail(rank_available_at_the_moment, 1)
      }
    )
  )


# # A tibble: 8 x 4
# # Groups:   ID [2]
#       ID rating order current_rank
#     <int>  <int> <int>        <int>
# 1     1      4     1            1
# 2     1      3     2            2
# 3     1      5     3            1
# 4     1      2     4            4
# 5     2      3     1            1
# 6     2      5     2            1
# 7     2      2     3            3
# 8     2      1     4            4

data

data <- read.table(text = "ID       rating        order
1        4             1
1        3             2
1        5             3
1        2             4
2        3             1
2        5             2
2        2             3
2        1             4", header = TRUE)

Upvotes: 2

Nuclear241

Reputation: 570

This chunk of code will work:

df <- tibble(
  ID = c(1,1,1,1,2,2,2,2),
  rating = c(4,3,5,2,3,5,2,1),
  rank = c(1,0,0,0,0,0,0,0)
)

for(i in 2:nrow(df)){
  if(df$ID[i] != df$ID[i-1]){
    df$rank[i] <- 1
  } else {
    df$rank[i] <- which(sort(df[1:i,]$rating[which(df$ID == df$ID[i])], decreasing = TRUE) == df$rating[i])
  }
}

Explanation:

Note that I assume your dataframe is already ordered based on ID and order. In my df there is no order column, but it is mainly for simplicity (and it is not necessarily needed in my solution, again, assuming the rows are already ordered by ID and order).

The for loop simply looks if the ID of that row is different from the row above, it automatically gets rank 1. Otherwise, it looks on the subset of df from row 1 to row i, subsets again by similar ID, sorts the ratings in that subset (including our current rating in question) in descending order, and takes the position of our currently asked rating to be assigned as its rank value.

I hope this answers your question and gives you insight.

Upvotes: 1

Creating a sequential ranking based on previous ratings

Answers (3)

Related Questions