Thomas Philips
Thomas Philips

Reputation: 1089

dplyr Modify multiple columns based on a single column in a dataframe

I have a large dataframe with responses to a questionnaire. My minimal working example (below) has the responses to 3 questions as well as the delay in responding to the questionnaire from which the answers are drawn

df <- data.frame(ID = LETTERS[1:10],
                 Q1  = sample(0:10, 10, replace=T),
                 Q2  = sample(0:10, 10, replace=T),
                 Q3  = sample(0:10, 10, replace=T),
                 Delay = 1:10
)

I'd like to change the responses with a delay > 3 to NA's. I can accomplish this easily enough for a single question:

df %>%
  mutate(Q1 = ifelse(Delay >3, NA, Q1))

which gives me

   ID Q1 Q2 Q3 Delay
1   A  5  6  9     1
2   B  8  1  5     2
3   C  8  4  6     3
4   D NA  7  1     4
5   E NA  8 10     5
6   F NA  9  4     6
7   G NA  1  6     7
8   H NA  8  9     8
9   I NA  9  1     9
10  J NA  5  7    10

I'd like instead to do this for all three questions with one statement (in my real life problem, I have over 20 questions, so it's tedious to do each question separately). I therefore create a vector of questions:

q_vec <- c("Q1", "Q2", "Q3")

and then tried variants of my earlier code such as

df %>%
  mutate(all_of(q_vec) = ifelse(Delay >3, NA, ~))

but nothing worked. What is the correct syntax for this?

Many thanks in advance

Thomas Philips

Upvotes: 0

Views: 354

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 389235

We can use across :

library(dplyr)

q_vec <- c("Q1", "Q2", "Q3")
df %>% mutate(across(all_of(q_vec), ~ifelse(Delay >3, NA, .)))

#   ID Q1 Q2 Q3 Delay
#1   A  1  5  0     1
#2   B  9  9  6     2
#3   C  5  7  1     3
#4   D NA NA NA     4
#5   E NA NA NA     5
#6   F NA NA NA     6
#7   G NA NA NA     7
#8   H NA NA NA     8
#9   I NA NA NA     9
#10  J NA NA NA    10

Or in base R :

df[q_vec][df$Delay > 3, ] <- NA

Upvotes: 2

Related Questions