Reputation: 83
I would like to mutate several variables at once using mutate_at(). Up to now I have recoded each variable by hand, but since I'm dealing with a long list of variables to recode and rename, I want to know how to do this with mutate_at(). I want to keep the original columns, which is why I'm using mutate() rather than rename(). This is what I normally do:
df <- df %>%
  mutate(q_50_a = as.numeric(`question_50_part_a: very long very long very long very long` == "yes"),
         q_50_b = as.numeric(`question_50_part_b: very long very long very long very long` == "yes"),
         q_50_c = as.numeric(`question_50_part_c: very long very long very long very long` == "yes"))
This is what I have so far:
df <- df %>% mutate_at(vars(starts_with("question_50")), funs(q_50 = as.numeric(. == "yes")))
It works and creates new numeric variables, but I'm not sure how to get it to name the new variables like this: q_50_a, q_50_b, q_50_c, etc.
Thank you.
Edit: this is what the data looks like (except there are many more columns, which all look alike):
question_50_part_a: a very long title   question_50_part_b: a very long title
yes                                     yes
yes                                     no
yes                                     no
yes                                     yes
no                                      no
yes                                     yes
I would like to get this:
q_50_a   q_50_b
1        1
1        0
1        0
1        1
0        0
1        1
However, I want to keep the original columns as they are and simply add these new columns with the shorter names and numeric binary coding.
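For anyone who wants to try the code in this thread, the sample above can be rebuilt as a small data frame. This is only a sketch: the object name df is taken from the question, the column titles are copied from the sample, and check.names = FALSE is what keeps the long, non-syntactic titles intact.

```r
# Sample data matching the table in the question.
# check.names = FALSE stops data.frame() from mangling the long titles.
df <- data.frame(
  `question_50_part_a: a very long title` = c("yes", "yes", "yes", "yes", "no", "yes"),
  `question_50_part_b: a very long title` = c("yes", "no", "no", "yes", "no", "yes"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

names(df)
```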
Upvotes: 1
Views: 2111
Reputation: 6931
Here is an approach that loops over each column:
library(dplyr)

column_names = colnames(df)
# optional: filter out column names you don't want to change here
for(col in column_names){
  # construct replacement name, e.g. "q_50_a" (relies on fixed character positions)
  col_replace = paste0("q_", substr(col, 10, 11), "_", substr(col, 18, 18))
  # assign and drop old column (omit the select() step to keep the original column)
  df = df %>%
    mutate(!!sym(col_replace) := ifelse(!!sym(col) == "yes", 1, 0)) %>%
    select(-!!sym(col))
}
Points to note:
The !!sym(col) construction takes the text string stored in col and turns it into a column name.
:= rather than = because the LHS requires some evaluation before assignment can happen.
ifelse instead of as.numeric, but you can code the RHS of the equals sign as you please.
col_replace makes some assumptions about the format of your input names. If everything is the same length this should work. If the number of characters differs (e.g. Q_9_a and Q_10_a) then you may want to use a method based on strsplit instead.
The - sign in select makes it exclude the specified column.
Upvotes: 0
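The strsplit-based naming mentioned in the notes of the loop answer could look something like the sketch below. The helper name short_name is hypothetical, and it assumes every title starts with "question_<number>_part_<letter>:"; titles in another format would need different indexing.

```r
# Hypothetical helper: build "q_<number>_<part>" from a long column title.
# Assumes each title starts with "question_<number>_part_<letter>:".
short_name <- function(col) {
  stem  <- strsplit(col, ":", fixed = TRUE)[[1]][1]  # e.g. "question_50_part_a"
  parts <- strsplit(stem, "_", fixed = TRUE)[[1]]    # "question" "50" "part" "a"
  paste0("q_", parts[2], "_", parts[4])
}

short_name("question_9_part_a: a very long title")   # "q_9_a"
short_name("question_10_part_b: a very long title")  # "q_10_b"
```

Because the pieces are found by splitting on "_" rather than by character position, one-digit and two-digit question numbers are handled the same way.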
Reputation: 388817
We can use rename_at to rename the new columns.
library(dplyr)
df %>%
  mutate_at(vars(starts_with('question_50')),
            list(new = ~as.numeric(. == 'yes'))) %>%
  rename_at(vars(ends_with('new')),
            ~sub('\\w+(_\\d+)_part(\\w+):.*', 'q\\1\\2', .))
#  question_50_part_a: a very long title question_50_part_b: a very long title
#1                                   yes                                   yes
#2                                   yes                                    no
#3                                   yes                                    no
#4                                   yes                                   yes
#5                                    no                                    no
#6                                   yes                                   yes
#  q_50_a q_50_b
#1      1      1
#2      1      0
#3      1      0
#4      1      1
#5      0      0
#6      1      1
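To see what the sub() pattern in the rename step does, it can be run on the intermediate names by themselves. The sketch below assumes the titles from the question's sample; the "_new" suffix is what mutate_at() appends when the function list is named new, and the vector name new_names is made up for illustration.

```r
# Intermediate names: original title plus the "_new" suffix
new_names <- c("question_50_part_a: a very long title_new",
               "question_50_part_b: a very long title_new")

# \\w+      the leading "question"
# (_\\d+)   group 1: "_50"
# _part     literal text that is dropped
# (\\w+)    group 2: "_a" or "_b"
# :.*       the colon and everything after it, including "_new"
short <- sub("\\w+(_\\d+)_part(\\w+):.*", "q\\1\\2", new_names)
short
# "q_50_a" "q_50_b"
```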
Upvotes: 1