Reputation: 148
I'm trying to write a function to automate the creation of some new variables using tidyverse tools. I figured out my problem involves tidyeval, but I haven't quite figured out where I went wrong in the code below, which is just reproducing the variable name. As a second step, I'd like to do something besides a for loop to apply the function a bunch of times. I've read enough StackOverflow answers shaming for loops, but I can't find a worked example for using some kind of apply function creating new variables on an existing dataframe. Thanks!
library(tidyverse)
x = c(0,1,2,3,4)
y = c(0,2,4,5,8)
df <- data.frame(x,y)
df
simple_func <- function(x) {
  var_name <- paste0("pre_", x, "_months")
  var_name <- enquo(var_name)
  df <- df %>%
    mutate(!! var_name := ifelse(x==y,1,0)) %>%
    mutate(!! var_name := replace_na(!! var_name))
  return(df)
}
simple_func(1)
#Desired result
temp <- data.frame("pre_1_months" = c(1,0,0,0,0))
temp
bind_cols(df,temp)
#Step 2, use some kind of apply function rather than a loop to apply this function sequentially
nums <- seq(1:10)
for (i in seq_along(nums)) {
  df <- simple_func(nums[i])
}
df
Upvotes: 0
Views: 72
Reputation: 173858
To build on @akrun's answer, the more idiomatic way to do this would be to pass df as the first parameter of your function and x as the second. You can vectorize the function by putting the loop inside it, so that it runs once for each element of x, by using rlang::syms instead of sym. It also makes the code shorter, and you can add it to a pipe as if it were a dplyr function.
simple_func <- function(df, x)
{
  for(var_name in rlang::syms(paste0("pre_", x, "_months")))
  {
    df <- mutate(df, !! var_name := replace_na(ifelse(x==y,1,0)))
  }
  df
}
So now you can do:
df %>% simple_func(1:5)
#> x y pre_1_months pre_2_months pre_3_months pre_4_months pre_5_months
#> 1 0 0 1 1 1 1 1
#> 2 1 2 0 0 0 0 0
#> 3 2 4 0 0 0 0 0
#> 4 3 5 0 0 0 0 0
#> 5 4 8 0 0 0 0 0
EDIT
Following the comment from Lionel Henry, and noting the OP's desire to avoid loops, here is a single function without an explicit loop that can be used in the pipe with x of arbitrary length, and which doesn't rely on converting to symbols:
simple_func <- function(df, x) {
  f <- function(v) df <<- mutate(df, !!v := replace_na(ifelse(x == y, 1, 0)))
  lapply(paste0("pre_", x, "_months"), f)
  return(df)
}
This works the same way:
df %>% simple_func(1:10)
#> x y pre_1_months pre_2_months pre_3_months pre_4_months pre_5_months pre_6_months
#> 1 0 0 1 1 1 1 1 1
#> 2 1 2 0 0 0 0 0 0
#> 3 2 4 0 0 0 0 0 0
#> 4 3 5 0 0 0 0 0 0
#> 5 4 8 0 0 0 0 0 0
#> pre_7_months pre_8_months pre_9_months pre_10_months
#> 1 1 1 1 1
#> 2 0 0 0 0
#> 3 0 0 0 0
#> 4 0 0 0 0
#> 5 0 0 0 0
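The lapply version above updates df through `<<-`. As a possible alternative, here is a minimal sketch that threads the data frame through purrr::reduce instead, so no outer variable is modified. The function name add_flags and the argument name n are hypothetical, chosen here so that the argument cannot be confused with the column x.

```r
library(dplyr)
library(purrr)

df <- data.frame(x = c(0, 1, 2, 3, 4), y = c(0, 2, 4, 5, 8))

# Hypothetical helper: adds one pre_<n>_months column per element of n.
add_flags <- function(df, n) {
  new_names <- paste0("pre_", n, "_months")
  # reduce() starts from df (.init) and applies mutate() once per name,
  # passing the growing data frame along as `acc`.
  reduce(
    new_names,
    function(acc, nm) mutate(acc, !!nm := ifelse(x == y, 1, 0)),
    .init = df
  )
}

out <- df %>% add_flags(1:3)
```

Inside mutate, x and y resolve to the columns of the data frame, which is why every new column ends up with the same values here, as in the outputs above.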
Upvotes: 1
Reputation: 887148
As var_name is a string, we can use sym to convert it to a symbol and then unquote it with !!:
simple_func <- function(x) {
  var_name <- paste0("pre_", x, "_months")
  var_name <- rlang::sym(var_name)
  df %>%
    mutate(!! var_name := ifelse(x==y,1,0)) %>%
    mutate(!! var_name := replace_na(!! var_name))
}
Checking with the OP's code:
nums <- seq(1:10)
for (i in seq_along(nums)) {
df <- simple_func(nums[i])
}
df
# x y pre_1_months pre_2_months pre_3_months pre_4_months pre_5_months pre_6_months pre_7_months pre_8_months
#1 0 0 1 1 1 1 1 1 1 1
#2 1 2 0 0 0 0 0 0 0 0
#3 2 4 0 0 0 0 0 0 0 0
#4 3 5 0 0 0 0 0 0 0 0
#5 4 8 0 0 0 0 0 0 0 0
# pre_9_months pre_10_months
#1 1 1
#2 0 0
#3 0 0
#4 0 0
#5 0 0
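To illustrate why the string has to become a symbol on the right-hand side, here is a small sketch. Unquoting a plain string inserts the string itself as a constant, which is consistent with the question's symptom of the variable name being reproduced. The sketch also shows one possible alternative, the .data pronoun, which looks a column up by its string name. The function name flag_with_pronoun, the argument names data and n, and the explicit replacement value 0 are assumptions made for this example.

```r
library(dplyr)
library(tidyr)

df <- data.frame(x = c(0, 1, 2, 3, 4), y = c(0, 2, 4, 5, 8))

# Unquoting a string on the right-hand side inserts the constant "y",
# not the column y, so every row of `copy` holds the text "y".
bad <- mutate(df, copy = !!"y")

# Hypothetical variant: a string on the left of := names the new column,
# and .data[[var_name]] reads that column back by name.
flag_with_pronoun <- function(data, n) {
  var_name <- paste0("pre_", n, "_months")
  data %>%
    mutate(!!var_name := ifelse(x == y, 1, 0)) %>%
    mutate(!!var_name := replace_na(.data[[var_name]], 0))
}

good <- flag_with_pronoun(df, 1)
```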
We could use map and change the mutate to transmute, so that each call returns only the new column:
simple_func <- function(x) {
  var_name <- paste0("pre_", x, "_months")
  var_name <- rlang::sym(var_name)
  df %>%
    transmute(!! var_name := ifelse(x==y,1,0)) %>%
    transmute(!! var_name := replace_na(!! var_name))
}
library(purrr)
library(dplyr)
map_dfc(1:10, simple_func) %>%
  bind_cols(df,.)
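The function above reads df from the calling environment. As a hedged variation on the same map-and-bind idea, the sketch below passes the data frame in explicitly. The names make_flag, data and n are hypothetical, and it uses map followed by bind_cols rather than map_dfc.

```r
library(dplyr)
library(purrr)

df <- data.frame(x = c(0, 1, 2, 3, 4), y = c(0, 2, 4, 5, 8))

# Hypothetical helper: returns a one-column data frame named pre_<n>_months.
make_flag <- function(data, n) {
  var_name <- paste0("pre_", n, "_months")
  transmute(data, !!var_name := ifelse(x == y, 1, 0))
}

# Build one single-column data frame per value, then bind them to df.
pieces <- map(1:3, function(n) make_flag(df, n))
result <- bind_cols(df, pieces)
```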
Upvotes: 1