Reputation: 2213
I have the following data. I would like to pass column names (as strings) as arguments to my_func and, within the function, have those strings converted to variables so they can be used in the formula as shown in option (1) below. I know option (2) works, but I would like to know how to do it with option (1).
Finally, I want to pass the new column name as an argument and assign the predictions to the xts object as a new column.
library(xts)      # provides as.xts()
library(splines)  # provides ns()
df_xts <- data.frame(date = structure(c(1167667200, 1167753600, 1167840000, 1167926400, 1168012800,
1168099200, 1168185600, 1168272000, 1168358400, 1168444800, 1168531200,
1168617600, 1168704000, 1168790400, 1168876800, 1168963200, 1169049600,
1169136000, 1169222400, 1169308800, 1169395200, 1169481600, 1169568000,
1169654400, 1169740800, 1169827200, 1169913600, 1.17e+09, 1170086400
), tzone = "", tclass = c("POSIXct", "POSIXt"), class = c("POSIXct", "POSIXt")),
x=1:29,y1=rnorm(29),y2=rnorm(29,2,2),y3=rnorm(29,3,3),y4=rnorm(29,4,4))
# columns 2:6 keep x and y1 to y4; the date column becomes the index
df_xts <- as.xts(df_xts[,2:6],order.by=df_xts$date)
my_func <- function(x,y,y_new,df){
# option (1): how do I convert the string arguments to variables so that I can plug them into the formula?
lr <- lm(y ~ ns(x,df=5),data=df)
# option (2): I know I can do it this way, but this is not what I want. How do I do it the way shown above?
lr <- lm(df[,c(y)] ~ ns(df[,c(x)],df=5))
# finally assign new column to xts object
df$y_new <- predict(lr, newdata=df$x,se=T)
return(df)
}
my_func(x='x',y='y1',y_new = 'y1_new',df=df_xts)
Ultimately, I want to lapply the function above across c("y1","y2","y3","y4").
Upvotes: 2
Views: 1004
Reputation: 41220
You can use ensym from the rlang package, which lets you pass arguments to the function either as strings or as symbols, and then substitute them into the expression before evaluating it:
my_func <- function(x,y,y_new,df){
x <- rlang::ensym(x)
y <- rlang::ensym(y)
y_new <- rlang::ensym(y_new)
lr <- eval(substitute(lm(y ~ splines::ns(x,df=5),data=df),list(x=x,y=y)))
# finally assign new column to xts object
eval(substitute(df$y_new <- predict(lr),list(y_new = y_new)))
return(df)
}
> my_func(x='x',y='y1',y_new = 'y1_new',df=df_xts)
date x y1 y2 y3 y4 y1_new
1 2007-01-01 17:00:00 1 0.8104089 -2.76764194 1.5904420 1.6583122 1.34258946
2 2007-01-02 17:00:00 2 1.3416652 3.97757263 6.2622732 8.3300956 0.84683353
3 2007-01-03 17:00:00 3 0.6925525 1.97349693 1.1367611 3.9290304 0.38163911
4 2007-01-04 17:00:00 4 -0.3231760 4.82490196 5.8738266 2.8540564 ...
This also works with symbols instead of strings:
my_func(x = x, y = y1, y_new = y1_new, df = df_xts)
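For the stated goal of applying the fit across several columns, one base-R alternative to the rlang approach is to build the formula from the strings with reformulate(). The following is only a minimal sketch: the data frame toy and the helper fit_from_names are hypothetical names introduced for illustration, and a plain data frame stands in for the xts object.

```r
library(splines)  # ns()

set.seed(1)
# Assumed toy data standing in for the original df_xts
toy <- data.frame(x = 1:29, y1 = rnorm(29), y2 = rnorm(29, 2, 2))

# Hypothetical helper: build y ~ ns(x, df = 5) from column-name strings
# and return the fitted values as a plain numeric vector.
fit_from_names <- function(x, y, data) {
  f <- reformulate(sprintf("ns(%s, df = 5)", x), response = y)
  unname(fitted(lm(f, data = data)))
}

cols <- c("y1", "y2")
# One vector of fitted values per response column
new_cols <- lapply(cols, function(col) fit_from_names("x", col, toy))
# Attach them as y1_new, y2_new
toy[paste0(cols, "_new")] <- new_cols
names(toy)
```

Because the column name is turned into a formula inside the helper, the anonymous function passed to lapply can hand over its loop variable directly as a string.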
It can be useful to run the function step by step to better understand what is happening here.
ensym transforms the inputs into symbols:
x = 'x'
y = 'y1'
x <- rlang::ensym(x)
y <- rlang::ensym(y)
> x
x
> y
y1
substitute replaces the symbols in the expression according to list(x=x,y=y) and creates a new expression:
> substitute(lm(y ~ splines::ns(x,df=5),data=df),list(x=x,y=y))
lm(y1 ~ splines::ns(x, df = 5), data = df)
eval evaluates the newly formed expression:
> eval(substitute(lm(y ~ splines::ns(x,df=5),data=df),list(x=x,y=y)))
Call:
lm(formula = y1 ~ splines::ns(x, df = 5), data = df)
Coefficients:
(Intercept) splines::ns(x, df = 5)1 splines::ns(x, df = 5)2
1.3426 -0.2424 -2.2221
splines::ns(x, df = 5)3 splines::ns(x, df = 5)4 splines::ns(x, df = 5)5
-0.6453 -3.4297 0.7092
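The same three steps can be tried outside a function using only base R, with as.name() standing in for ensym to turn a string into a symbol. This is a sketch under assumptions: toy, x_sym, y_sym and call_expr are hypothetical names, not part of the original answer.

```r
library(splines)  # ns()

set.seed(1)
toy <- data.frame(x = 1:29, y1 = rnorm(29))  # assumed toy data

# Step 1: string -> symbol
x_sym <- as.name("x")
y_sym <- as.name("y1")

# Step 2: substitute the symbols into the unevaluated call
call_expr <- substitute(
  lm(y ~ ns(x, df = 5), data = toy),
  list(x = x_sym, y = y_sym)
)
print(call_expr)

# Step 3: evaluate the completed call to fit the model
fit <- eval(call_expr)
```

Note that only the symbols x and y are replaced; the argument name df in ns(x, df = 5) is left alone because argument names are not symbols in the expression.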
This technique is used extensively in packages like ggplot2 (see quasiquotation):
library(ggplot2)
ggplot(df_xts)+geom_point(aes(x=x,y=y1))
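When the column names are held as strings rather than typed as bare symbols, ggplot2's .data pronoun is one way to look them up inside aes(). A minimal sketch, assuming a plain data frame toy and hypothetical variables x_col and y_col (none of which appear in the original answer):

```r
library(ggplot2)

set.seed(1)
toy <- data.frame(x = 1:29, y1 = rnorm(29))  # assumed toy data

x_col <- "x"   # hypothetical: column names held as strings
y_col <- "y1"

# .data[[name]] looks the column up by its string name inside aes()
p <- ggplot(toy) + geom_point(aes(x = .data[[x_col]], y = .data[[y_col]]))
```

Printing p draws the scatter plot of y1 against x.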
Upvotes: 4