Reputation: 1294
I am learning how to use functions. Hopefully this is an easy question to answer.
I have a data frame df and I want to apply the function w to some of its variables.
df <- data.frame(id= c(1,1,1,2,2,2,3,3,3), time=c(1,2,3,1,2,3,1,2,3),y = rnorm(9), x1 = rnorm(9), x2 = c(0,0,0,0,1,0,1,1,1),c2 = rnorm(9))
library(data.table)
library(dplyr)
w <- function(data, var1, var2) {
  x <- substitute(var1)
  y <- substitute(var2)
  data <- setDT(data)[, paste("times", (var1), (var2), sep = "_") := eval(var1) * eval(var2)]
}
df2<- w(df,y,x1)
When I apply the function to a single variable, it works. However, when I try to apply it to several variables in my data frame at once, it fails. Does anyone know how I could make it work?
So far I have tried the following:
vars<-c("x1","x2")
df3<- lapply(vars, function(x) w(df,y, x))
Thanks a lot for your help.
Upvotes: 0
Views: 46
Reputation: 6226
data.table works well with variable names passed as strings. Use get to look up a column by its name, so that it is evaluated in the scope of the data.table object. I wrote a blog post exactly on that subject, if it can help you.
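As a small illustration of get on its own, here is a minimal sketch on a toy table. The names dt, col and doubled are hypothetical and only serve the example; they are not part of the question.

```r
library(data.table)

# Hypothetical toy table, only for illustration
dt <- data.table(a = 1:3, b = c(10, 20, 30))

col <- "b"                      # column name held as a character string
dt[, doubled := get(col) * 2]   # get() resolves "b" to the column inside j

print(dt$doubled)
```

The same idea carries over to the function below: the caller passes strings, and get turns them back into columns.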
df <- data.frame(id= c(1,1,1,2,2,2,3,3,3), time=c(1,2,3,1,2,3,1,2,3),y = rnorm(9), x1 = rnorm(9), x2 = c(0,0,0,0,1,0,1,1,1),c2 = rnorm(9))
library(data.table)
setDT(df)
Your function can be simplified to:
w <- function(data, var1, var2) {
  if (!inherits(data, "data.table")) {
    setDT(data)
  }
  data[, (paste("times", var1, var2, sep = "_")) := get(var1) * get(var2)]
}
And you call it by passing the variable names as strings:
vars<-c("x1","x2")
lapply(vars, function(x) w(df,"y", x))
df
   id time           y         x1 x2          c2 times_y_x1 times_y_x2
1:  1    1 -0.81438357  0.4493933  0 -0.39143328 -0.3659786  0.0000000
2:  1    2  0.36358498 -1.3574671  0  0.06062278 -0.4935547  0.0000000
3:  1    3  0.04049807  0.2860555  0  1.58123937  0.0115847  0.0000000
4:  2    1  0.15490901 -0.8654069  0 -1.09874917 -0.1340593  0.0000000
5:  2    2 -0.87899821  0.2863604  1 -0.73161360 -0.2517103 -0.8789982
6:  2    3  0.37881104  1.6135654  0  1.30268569  0.6112364  0.0000000
7:  3    1 -0.72990680  0.5867623  1  0.41856548 -0.4282818 -0.7299068
8:  3    2 -0.53344035  0.5073415  1  0.64326809 -0.2706364 -0.5334404
9:  3    3 -0.27674109 -0.5226920  1 -2.28723895  0.1446504 -0.2767411
Note that := updates your data table by reference, so you don't need to reassign the output.
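Since the update happens in place, lapply is only being used for its side effect, and a plain for loop does the same job without printing a list of results. A minimal sketch, assuming the w defined in this answer; the set.seed call and the small table dt are my own additions so that the example is self-contained:

```r
library(data.table)

# Same helper as in the answer: multiplies two columns given by name
w <- function(data, var1, var2) {
  if (!inherits(data, "data.table")) {
    setDT(data)
  }
  data[, (paste("times", var1, var2, sep = "_")) := get(var1) * get(var2)]
}

set.seed(1)  # assumption: fixed seed only to make the sketch reproducible
dt <- data.table(y = rnorm(3), x1 = rnorm(3), x2 = c(0, 1, 1))

for (v in c("x1", "x2")) {
  w(dt, "y", v)   # dt is modified in place; nothing is assigned
}

print(names(dt))
```

If you prefer to keep lapply, wrapping the call in invisible() should suppress the printed list in the same way.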
Upvotes: 1