Reputation: 45
I'm trying to automate the creation of variables indicating whether students' answers (variables beginning with l, m, f or g) to the questions (e.g. variables starting with "test_") are correct or not. This is done by checking whether, for example, test_l1 == l1.
I cannot figure out how to do this other than by using the column index, which is very tedious and produces a lot of code.
Below is a toy dataset that mimics the structure of the actual dataset, which has 4 different kinds of tests with 12 exercises each (test_l1 ~ test_l12, test_m1 ~ test_m12, test_f1~, test_g1~) and the corresponding student responses (l1 ~ l12, m1 ~ m12, f1~, g1~). I would like to create 48 variables named correct_l1 ~ correct_l12, correct_m1~, correct_f1~, etc.
df <- data.frame(test_l1 = c(1, 0, 0),
                 test_l2 = c(1, 1, 1),
                 test_m1 = c(0, 1, 0),
                 test_m2 = c(0, 1, 1),
                 l1 = c(0, 1, 0),
                 l2 = c(1, 1, 1),
                 m1 = c(1, 1, 1),
                 m2 = c(0, 0, 1))
Many thanks in advance!!!
Upvotes: 3
Views: 344
Reputation: 21908
Here is a tidyverse solution you can use:
library(dplyr)
df %>%
  mutate(across(starts_with("test_"),
                ~ .x == get(sub("test_", "", cur_column())),
                .names = '{gsub("test_", "answer_", .col)}'))
  test_l1 test_l2 test_m1 test_m2 l1 l2 m1 m2 answer_l1 answer_l2 answer_m1 answer_m2
1       1       1       0       0  0  1  1  0     FALSE      TRUE     FALSE      TRUE
2       0       1       1       1  1  1  1  0     FALSE      TRUE      TRUE     FALSE
3       0       1       0       1  0  1  1  1      TRUE      TRUE     FALSE      TRUE
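To make the name mapping in the across() call easier to follow, here is a rough base R loop that computes the same columns one at a time. It is only a sketch of the equivalent logic, not the code from this answer; the names res, test_col, resp_col and new_col are illustrative, and it assumes every test_ column has a matching response column.

```r
# Toy data from the question
df <- data.frame(test_l1 = c(1, 0, 0), test_l2 = c(1, 1, 1),
                 test_m1 = c(0, 1, 0), test_m2 = c(0, 1, 1),
                 l1 = c(0, 1, 0), l2 = c(1, 1, 1),
                 m1 = c(1, 1, 1), m2 = c(0, 0, 1))

res <- df
for (test_col in grep("^test_", names(df), value = TRUE)) {
  resp_col <- sub("^test_", "", test_col)         # "test_l1" -> "l1"
  new_col  <- sub("^test_", "answer_", test_col)  # "test_l1" -> "answer_l1"
  # Element-wise comparison of the answer key with the student's response
  res[[new_col]] <- df[[test_col]] == df[[resp_col]]
}
res
```

Replacing "answer_" with "correct_" in the loop (or in the .names argument above) would give the column names requested in the question.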
Upvotes: 3
Reputation: 388982
Get all the 'test' columns in test_cols, then remove the string 'test_' from test_cols to get the corresponding columns to compare.
Directly compare the two data frames and create the new columns.
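# Anchor the pattern so only columns that start with 'test_' are matched
test_cols <- grep('^test_', names(df), value = TRUE)
# Strip the prefix to get the matching response columns (l1, l2, m1, ...)
ans_cols <- sub('^test_', '', test_cols)
# Compare the two sets of columns element-wise and store the results
df[paste0('correct_', ans_cols)] <- df[test_cols] == df[ans_cols]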
df
#  test_l1 test_l2 test_m1 test_m2 l1 l2 m1 m2 correct_l1 correct_l2 correct_m1 correct_m2
#1       1       1       0       0  0  1  1  0      FALSE       TRUE      FALSE       TRUE
#2       0       1       1       1  1  1  1  0      FALSE       TRUE       TRUE      FALSE
#3       0       1       0       1  0  1  1  1       TRUE       TRUE      FALSE       TRUE
where TRUE means the answer is correct and FALSE means the answer is wrong.
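With 48 test columns in the real data, it may be worth checking that every test_ column actually has a matching response column before comparing. The following is a minimal sketch of the same approach wrapped in a function; add_correct_cols and its arguments are hypothetical names introduced here for illustration, not part of any package.

```r
# Hypothetical helper: adds one <new_prefix><name> column per <prefix><name> column
add_correct_cols <- function(data, prefix = "test_", new_prefix = "correct_") {
  # Columns whose names start with the prefix (the answer keys)
  test_cols <- names(data)[startsWith(names(data), prefix)]
  if (length(test_cols) == 0) return(data)
  # Drop the prefix to get the response column names
  ans_cols <- substring(test_cols, nchar(prefix) + 1)
  # Fail early if a response column is missing
  missing_cols <- setdiff(ans_cols, names(data))
  if (length(missing_cols) > 0) {
    stop("No response column for: ", paste(missing_cols, collapse = ", "))
  }
  # Element-wise comparison of the two sets of columns
  data[paste0(new_prefix, ans_cols)] <- data[test_cols] == data[ans_cols]
  data
}

# Toy data from the question
df <- data.frame(test_l1 = c(1, 0, 0), test_l2 = c(1, 1, 1),
                 test_m1 = c(0, 1, 0), test_m2 = c(0, 1, 1),
                 l1 = c(0, 1, 0), l2 = c(1, 1, 1),
                 m1 = c(1, 1, 1), m2 = c(0, 0, 1))
scored <- add_correct_cols(df)
scored
```

Note that if a response is NA, the comparison also yields NA rather than FALSE, so missing responses may need to be handled separately.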
Upvotes: 1