Reputation: 69
Disclaimer: I think there is a much more efficient solution (perhaps an anonymous function with a list or *apply functions?) hence why I have come to you much more experienced people for help!
The data
Let's say I have a df with participant responses to 3 question As and 3 question Bs e.g.
qa1, qa2, qa3, qb1, qb2, qb3
1, 3, 1, 2, 4, 4
1, 3, 2, 2, 1, 4
2, 3, 1, 2, 1, 4
1, 3, 2, 1, 1, 3
EDIT df also contains other columns with other irrelevant data!
I have a vector with correct answers to each of qa1-3 and qb1-3 in sequence with the columns.
correct_answer <- c(1,3,2,2,1,4)
(i.e. for qa1,qa2,qa3,qb1,qb2,qb3)
Desired manipulation
I want to create a new column per question (e.g. qa1_correct), coding for whether the participant has responded correctly (1) or incorrectly (0) based on matching each response in df with corresponding answer in correct_answer. Ideally I would end up with:
qa1, qa2, qa3, qb1, qb2, qb3, qa1_correct, qa2_correct, qa3_correct ...
1, 3, 1, 2, 4, 4, 1, 1, 0, ...
1, 3, 2, 2, 1, 4, 1, 1, 1, ...
2, 3, 1, 2, 1, 4, 0, 1, 0, ...
1, 3, 2, 1, 1, 3, 1, 1, 1, ...
Failed Attempt
This is my attempt for question As only (would repeat for Bs) but it doesn't work (maybe wrong function paste0()?):
index <- c(1:3)
for (i in index) {
df <- df %>% mutate(paste0("qa",i,"_correct") =
case_when(paste0("qa"i) == correct_answer[i] ~ 1,
paste0("qa"i) != correct_answer[i] ~ 0))
}
Many thanks for any guidance!
Upvotes: 0
Views: 2999
Reputation: 21908
You can also use the following solution in base R:
cbind(df,
do.call(cbind, mapply(function(x, y) as.data.frame({+(x == y)}),
df, correct_answer, SIMPLIFY = FALSE)) |>
setNames(paste0(names(df), "_corr")))
qa1 qa2 qa3 qb1 qb2 qb3 qa1_corr qa2_corr qa3_corr qb1_corr qb2_corr qb3_corr
1 1 3 1 2 4 4 1 1 0 1 0 1
2 1 3 2 2 1 4 0 0 0 0 0 0
3 2 3 1 2 1 4 1 0 0 0 0 0
4 1 3 2 1 1 3 1 1 1 0 1 0
Or a potential tidyverse
solution could be:
library(tidyr)
library(purrr)
df %>%
mutate(output = pmap(df, ~ setNames(+(c(...) == correct_answer),
paste0(names(df), "_corr")))) %>%
unnest_wider(output)
qa1 qa2 qa3 qb1 qb2 qb3 qa1_corr qa2_corr qa3_corr qb1_corr qb2_corr qb3_corr
1 1 3 1 2 4 4 1 1 0 1 0 1
2 1 3 2 2 1 4 0 0 0 0 0 0
3 2 3 1 2 1 4 1 0 0 0 0 0
4 1 3 2 1 1 3 1 1 1 0 1 0
Upvotes: 2
Reputation: 26218
This may also be an alternative (In R version 4.1.0 onwards that has made apply gain a new argument simplify with default TRUE)
df <- read.table(header = T, text = 'qa1, qa2, qa3, qb1, qb2, qb3
1, 3, 1, 2, 4, 4
1, 3, 2, 2, 1, 4
2, 3, 1, 2, 1, 4
1, 3, 2, 1, 1, 3', sep = ',')
df
#> qa1 qa2 qa3 qb1 qb2 qb3
#> 1 1 3 1 2 4 4
#> 2 1 3 2 2 1 4
#> 3 2 3 1 2 1 4
#> 4 1 3 2 1 1 3
correct_answer <- c(1,3,2,2,1,4)
cbind(df,
setNames(as.data.frame(t(apply(df, 1,
\(x) +(x == correct_answer)))),
paste0(names(df), '_correct')))
#> qa1 qa2 qa3 qb1 qb2 qb3 qa1_correct qa2_correct qa3_correct qb1_correct
#> 1 1 3 1 2 4 4 1 1 0 1
#> 2 1 3 2 2 1 4 1 1 1 1
#> 3 2 3 1 2 1 4 0 1 0 1
#> 4 1 3 2 1 1 3 1 1 1 0
#> qb2_correct qb3_correct
#> 1 0 1
#> 2 1 1
#> 3 1 1
#> 4 1 0
Created on 2021-07-23 by the reprex package (v2.0.0)
Upvotes: 2
Reputation: 3671
You can combine mutate
and across
.
Code 1: Correct_answer as vector
df %>%
mutate(across(everything(),
~as.numeric(.x == correct_answer[names(df) == cur_column()]),
.names = "{.col}_correct"))
Code 2: Correct_answer as data.frame (df_correct)
correct_answer <- c(1,3,2,2,1,4)
df_correct <- data.frame(
matrix(correct_answer, ncol = length(correct_answer))
)
colnames(df_correct) <- names(df)
df %>%
mutate(across(everything(),
.fn = ~as.numeric(.x == df_correct[,cur_column()]),
.names = "{.col}_correct"))
Output
qa1 qa2 qa3 qb1 qb2 qb3 qa1_correct qa2_correct qa3_correct qb1_correct qb2_correct qb3_correct
1 1 3 1 2 4 4 1 1 0 1 0 1
2 1 3 2 2 1 4 1 1 1 1 1 1
3 2 3 1 2 1 4 0 1 0 1 1 1
4 1 3 2 1 1 3 1 1 1 0 1 0
Upvotes: 2
Reputation: 69
EDIT works with addition of sym()
Found a related solution here Paste variable name in mutate (dplyr) but it only pastes 0's
for (i in index) {
df <- df %>% mutate( !!paste0("qa",i,"_correct") :=
case_when(!!sym(paste0("qa",i)) == correct_answer[i] ~ 1,
!!sym(paste0("qa",i)) != correct_answer[i] ~ 0))
}
Upvotes: 0
Reputation: 2949
Try this:
df_new <- cbind(df, t(apply(df, 1, function(x) as.numeric(x == correct_answer))))
Upvotes: 0