Reputation: 193
I am trying to create a reusable function. Inside the function, I would like to build a formula, fit it with lm, and output the summary of the regression results.
I have tried using as.formula to create the formula inside my function, but the code below gives error messages and I am not sure why. Could anyone help me?
# create the data
x <- c(1,2,3,5,6,7,8,1,1,2,1)
y <- c(2,3,4,5,1,3,4,5,6,7,2)
z <- c(2,3,4,1,2,3,33,5,2,4,5)
i <- c(2,4,4,5,1,3,2,5,6,7,2)
j <- c(2,9,4,1,2,3,4,5,2,4,5)
k <- c(2,12,4,5,1,3,4,5,6,7,2)
q <- c(2,55,4,1,2,5,4,5,2,4,5)
m <- data.frame(x,y,z)
# the function
polyRegress <- function(pre1, pre2, dv, df){
# This is the formula I want to test:
# model <- lm(z ~ x + y + I(x^2) + I(x*y) + I(y^2), data=m)
f <- as.formula(paste0(dv, " ~ ", pre1, " + ", pre2, " + ", "I(", pre1, "^2)", " + ", "I(", pre1, "*", pre2, ")", " + ", "I(", pre2, "^2)")
results <- lm(f, data=df)
summary(results)
}
# main
polyRegress(x, y, z, m)
polyRegress(i, j, k, m)
Also, in the output of the two polyRegress calls above, I want the coefficient names to be x, y, I(x^2), I(x * y), I(y^2) and i, j, I(i^2), I(i * j), I(j^2), rather than pre1, pre2, I(pre1^2), I(pre1 * pre2), I(pre2^2).
Upvotes: 1
Views: 2688
Reputation: 141
With your example, I think you don't need the df argument, because x, y, z, i, ... are vectors in your workspace.
When you call polyRegress(x, y, z, m), you pass the vectors x, y and z themselves, not the column names of m.
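To illustrate the point about vectors versus names, here is a minimal sketch with shortened, made-up vectors (not the data from the question). paste0 is vectorised, so when it receives numeric vectors it pastes their values element by element, which is not a usable model formula:

```r
# Small illustrative vectors (assumed values, only for this demonstration).
x <- c(1, 2, 3)
y <- c(2, 3, 4)
z <- c(5, 6, 7)

# Passing the vectors themselves: one string per element, built from the values.
from_values <- paste0(z, " ~ ", x, " + ", y)
print(from_values)

# Passing the names as strings: a single string that refers to the variables.
from_names <- paste0("z", " ~ ", "x", " + ", "y")
print(from_names)
```

The first result has one string per observation and contains numbers only, so the variable names are already lost before as.formula is called.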
So, in the first case, you can use substitute to capture the name of each argument, which also gives the coefficients the names you want.
# create the data
x <- c(1,2,3,5,6,7,8,1,1,2,1)
y <- c(2,3,4,5,1,3,4,5,6,7,2)
z <- c(2,3,4,1,2,3,33,5,2,4,5)
i <- c(2,4,4,5,1,3,2,5,6,7,2)
j <- c(2,9,4,1,2,3,4,5,2,4,5)
k <- c(2,12,4,5,1,3,4,5,6,7,2)
q <- c(2,55,4,1,2,5,4,5,2,4,5)
m <- data.frame(x,y,z)
# the function
polyRegress <- function(pre1, pre2, dv){
# replace pre1 with the name of the argument that was passed, e.g. "x" or "i"
pre1 <- deparse(substitute(pre1))
pre2 <- deparse(substitute(pre2))
dv <- deparse(substitute(dv))
f <- paste0(dv, " ~ ", pre1, " + ", pre2, " + ", "I(", pre1, "^2)", " + ", "I(", pre1, "*", pre2, ")", " + ", "I(", pre2, "^2)")
results <- lm(f)
# at this step results$call is lm(formula = f); replace it so the summary shows the actual formula
results$call <- call('lm', formula = formula(f))
summary(results)
}
# main
polyRegress(x, y, z)
polyRegress(i, j, k)
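In isolation, the deparse(substitute()) idiom used above can be sketched like this. The helper name arg_name is hypothetical and only serves the demonstration:

```r
# arg_name is a hypothetical helper, not part of the solution above.
arg_name <- function(arg) {
  # substitute() returns the unevaluated expression supplied for `arg`;
  # deparse() converts that expression into a character string.
  deparse(substitute(arg))
}

x <- c(1, 2, 3)
name_of_x <- arg_name(x)
print(name_of_x)

# The argument is never evaluated, so the object does not even need to exist.
name_of_missing <- arg_name(not_defined_anywhere)
print(name_of_missing)
```

Note that deparse can return more than one string for very long expressions, so this idiom is best suited to plain variable names as used here.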
But if you really want to refer to variables in your data frame, you have to pass the arguments as character strings, because you want to use the data frame's column names.
# create the data
m <- data.frame(x,y,z,i,j,k)
rm(x,y,z,i,j,k)
# the function
polyRegress <- function(pre1, pre2, dv, df){
f <- paste0(dv, " ~ ", pre1, " + ", pre2, " + ", "I(", pre1, "^2)", " + ", "I(", pre1, "*", pre2, ")", " + ", "I(", pre2, "^2)")
results <- lm(f, data = df)
# at this step results$call is lm(formula = f, data = df); replace it so the summary shows the actual formula and data
results$call <- call('lm', formula = formula(f), data = substitute(df))
summary(results)
}
# main
polyRegress("x", "y", "z", m)
polyRegress("i", "j", "k", m)
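As a quick check that the character-based approach gives the coefficient names asked for, here is a sketch that builds the same formula with reformulate from the stats package instead of one long paste0 call. This is a swapped-in alternative, not what the function above does, and the helper name poly_formula is hypothetical:

```r
# Data frame with the x, y, z columns from the question.
m <- data.frame(
  x = c(1, 2, 3, 5, 6, 7, 8, 1, 1, 2, 1),
  y = c(2, 3, 4, 5, 1, 3, 4, 5, 6, 7, 2),
  z = c(2, 3, 4, 1, 2, 3, 33, 5, 2, 4, 5)
)

# poly_formula is a hypothetical helper: it takes column names as strings
# and returns the second-order polynomial formula as a formula object.
poly_formula <- function(pre1, pre2, dv) {
  term_labels <- c(
    pre1,
    pre2,
    sprintf("I(%s^2)", pre1),
    sprintf("I(%s*%s)", pre1, pre2),
    sprintf("I(%s^2)", pre2)
  )
  reformulate(term_labels, response = dv)
}

f <- poly_formula("x", "y", "z")
fit <- lm(f, data = m)

# The coefficient names use the real column names, not pre1 and pre2.
coef_names <- names(coef(fit))
print(coef_names)
```

Because the result is already a formula object, fit$call shows lm(formula = f, data = m) unless it is rewritten as in the function above, but the coefficient table itself carries the column names.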
I hope I have understood your question.
Upvotes: 1