RRL

Reputation: 21

Scoring partial credit on multiple-response multiple-choice exam questions in R

I am trying to grade multiple-response multiple-choice exam questions using R. I want to create a separate column in my dataframe with the score. The score depends on how many right and wrong choices the student has made: +1 for every right choice and -1 for every wrong choice. In this grading scheme, not choosing E when E is not correct counts as a right choice. For instance, if the right answer is choices A and D, and the student answered AB, the score over choices A to E would be +1-1+1-1+1 = 1.

Here is a sample of what my dataframe looks like:

 mydata <- structure(list(Student = 1:5, Question = c("Q1", "Q1", "Q1", "Q1", "Q1"), 
                     Answer = c("A", "BC", "AD", "AC", "BD"), 
                     Key = c("AD", "AD", "AD", "AD", "AD")),
                     .Names = c("Student", "Question", "Answer", "Key"), 
                     class = "data.frame", row.names = c(NA, -5L))

I cannot figure out how to tell R to compare the two columns ("Answer" and "Key"), identify letters that are either present in both or absent from both, assign a value to each comparison (A present in both columns, A absent from both columns, B present in both columns, and so on), and then add those values up.

Alternatively, each separate calculation (A present in both columns, A absent from both columns, B present in both columns, and so on) could be placed in its own column, and the sum calculated from those columns.

I have searched through so many posts, but cannot find similar issues. Most posts compare numeric columns and use ><= types of comparisons, which do not work for my problem.

I do appreciate any help you can provide. Thank you in advance!

Upvotes: 2

Views: 443

Answers (1)

C. Braun

Reputation: 5201

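Here is one way to count how many correct choices each student selected, using dplyr. Note that it does not apply the -1 penalty for wrong choices:

> library(dplyr)
> mydata %>% 
  dplyr::rowwise() %>% 
  dplyr::mutate(score = length(intersect(strsplit(Answer, '')[[1]], strsplit(Key, '')[[1]])))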

# A tibble: 5 x 5
  Student Question Answer Key   score
    <int> <chr>    <chr>  <chr> <int>
1       1 Q1       A      AD        1
2       2 Q1       BC     AD        0
3       3 Q1       AD     AD        2
4       4 Q1       AC     AD        1
5       5 Q1       BD     AD        1
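
To see what the expression inside `mutate()` does, it can help to run it on a single answer/key pair. This is only an illustrative sketch in base R; the variable names `picked`, `correct`, and `hits` are made up for the example and are not part of the answer above.

```r
# One row from the sample data: Student 4 answered "AC", the key is "AD"
answer <- "AC"
key <- "AD"

# Split each string into single letters
picked <- strsplit(answer, "")[[1]]
correct <- strsplit(key, "")[[1]]

# Letters that appear in both the answer and the key
hits <- intersect(picked, correct)

# Number of correct choices the student selected
length(hits)
```

The wrong pick (C) and the missed correct choice (D) do not affect this count, which is why this approach alone does not give the +1/-1 score.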

Here is another way that accounts for the +1/-1 for each right or wrong choice. Because the full set of possible choices cannot be inferred from the data alone (E never appears in the sample, for example), you will have to list the choices explicitly.

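# Every possible choice has to be listed by hand
all_choices <- c('A', 'B', 'C', 'D', 'E')
for(choice in all_choices) {
   # xor() is TRUE when Answer and Key disagree on this choice, giving -1; otherwise +1
   mydata[ , choice] <- 1 + xor(grepl(choice, mydata$Answer), grepl(choice, mydata$Key)) * -2
}
# The total score is the sum of the per-choice columns
mydata$score <- rowSums(mydata[ , all_choices])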

> mydata
  Student Question Answer Key  A  B  C  D E score
1       1       Q1      A  AD  1  1  1 -1 1     3
2       2       Q1     BC  AD -1 -1 -1 -1 1    -3
3       3       Q1     AD  AD  1  1  1  1 1     5
4       4       Q1     AC  AD  1  1 -1 -1 1     1
5       5       Q1     BD  AD -1 -1  1  1 1     1
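
If the per-choice columns are not needed, the same score can be computed directly, since every choice that appears in exactly one of Answer and Key costs 2 points relative to a perfect score. The following is a minimal sketch under that observation, not part of the original answer; `score_response` is a hypothetical helper name, and the default `all_choices` assumes five options A to E as above.

```r
# Hypothetical helper: score one response against its key
score_response <- function(answer, key,
                           all_choices = c("A", "B", "C", "D", "E")) {
  picked <- strsplit(answer, "")[[1]]
  correct <- strsplit(key, "")[[1]]
  # Choices present in exactly one of the two sets are the wrong ones
  mismatched <- union(setdiff(picked, correct), setdiff(correct, picked))
  # Start from +1 for every choice, then subtract 2 for each mismatch
  length(all_choices) - 2 * length(mismatched)
}

# Same Answer and Key values as in the sample data
answers <- c("A", "BC", "AD", "AC", "BD")
keys <- rep("AD", 5)

scores <- mapply(score_response, answers, keys, USE.NAMES = FALSE)
scores
```

With the sample data this should reproduce the score column shown above, and `score_response("AB", "AD")` gives the 1 from the example in the question.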

Upvotes: 3
