An efficient way to score two tests in an R data.table

Question

Suppose I have the following data.table with answers to two different tests, red and blue:

library(data.table)
dt <- data.table(
  class = rep("math", 4),
  test = c("red", "red", "blue", "red"),
  student = 1:4,
  q1_answer = c("a", "a", "b", "a"),
  q2_answer = c("b", "c", "b", NA),
  q3_answer = c("c", "c", "c", NA)
)
# dt
#   class test student q1_answer q2_answer q3_answer
#1:  math  red       1         a         b         c
#2:  math  red       2         a         c         c
#3:  math blue       3         b         b         c
#4:  math  red       4         a      <NA>      <NA>

The answer keys for the blue and red tests are the following:

red_answer_key <- c("a", "b", "c")
blue_answer_key <- c("b", "c", "d")

How could I score the two tests so I would have the score column in the following table?

#   class test student q1_answer q2_answer q3_answer score
#1:  math  red       1         a         b         c     3
#2:  math  red       2         a         c         c     2
#3:  math blue       3         b         b         c     1
#4:  math  red       4         a      <NA>      <NA>     1  # count NA as incorrect

s_baldur · Accepted Answer

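One option is to reshape to long format, score each student against the key for their test, and join the scores back in. Note that the == comparison returns NA for a missing answer, so as written student 4's score comes out as NA rather than 1: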

key_list <- list(
  red = red_answer_key,
  blue = blue_answer_key
)
dt_long <- dt[, melt(.SD, id.vars = c("class", "test", "student"))]
dt_scores <- dt_long[, .(score = sum(value == key_list[[test]])), keyby = .(student, test)]
# Join back in
dt[, score := dt_scores[.SD, on = .(student, test), score]]

#    class test student q1_answer q2_answer q3_answer score
# 1:  math  red       1         a         b         c     3
# 2:  math  red       2         a         c         c     2
# 3:  math blue       3         b         b         c     1
# 4:  math  red       4         a      <NA>      <NA>    NA
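
To get the score of 1 for student 4 that the question asks for (NA counted as incorrect), one possible tweak is to pass na.rm = TRUE to sum(), so the NA comparisons are dropped instead of propagating. This is a minimal sketch that repeats the setup from the question to stay self-contained. It calls melt() directly on dt rather than through .SD, which I would expect to be equivalent here. It also assumes every test has a key in key_list and that the answer columns are in the same order as the keys.

```r
library(data.table)

# Same data and keys as in the question
dt <- data.table(
  class = rep("math", 4),
  test = c("red", "red", "blue", "red"),
  student = 1:4,
  q1_answer = c("a", "a", "b", "a"),
  q2_answer = c("b", "c", "b", NA),
  q3_answer = c("c", "c", "c", NA)
)
key_list <- list(
  red  = c("a", "b", "c"),
  blue = c("b", "c", "d")
)

# Long format: one row per student and question
dt_long <- melt(dt, id.vars = c("class", "test", "student"))

# Comparing NA to a key gives NA; na.rm = TRUE drops those comparisons,
# so an unanswered question contributes nothing to the score
dt_scores <- dt_long[
  , .(score = sum(value == key_list[[test]], na.rm = TRUE)),
  keyby = .(student, test)
]

# Join the per-student scores back onto the wide table
dt[, score := dt_scores[.SD, on = .(student, test), score]]
```

With this change, students 1 to 3 keep their scores of 3, 2 and 1, and student 4 gets 1 instead of NA.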
