Reputation: 975
Please help me understand the reproducible example below. I am trying to write a function glm_func() that calls glm(). The same glm() call works perfectly fine outside of a function. However, if I pass the linear model formula as an argument, glm_func() fails with a strange error:
Error in eval(extras, data, env) : object 'modeldata' not found
Can someone help me understand what went wrong?
# Fully reproducible example
# Specify data
aa = data.frame(y=1:100, x1=1:100, x2=rep(1, 100), z=runif(100))
lm_formula = as.formula('y ~ x1 + x2')
weight_var = 'z'
# GLM works as-is outside of a function
model1 = glm(formula = lm_formula, data = aa, weights = aa[[weight_var]])
# Why does this function not work?
glm_func <- function(modeldata, formula, weight){
  thismodel=glm(
    formula = formula, #<----- Does not work if formula is passed from argument
    data = modeldata, weights = modeldata[[weight]])}
glm_func(modeldata=aa, formula=lm_formula, weight=weight_var)
# This function works
glm_func2 <- function(modeldata, weight){
  thismodel=glm(
    formula = y ~ x1 + x2, #<----- Works if formula is hardcoded
    data = modeldata, weights = modeldata[[weight]])}
glm_func2(modeldata=aa, weight=weight_var)
Upvotes: 1
Views: 97
Reputation: 132676
From help("formula"):
A formula object has an associated environment, and this environment (rather than the parent environment) is used by model.frame to evaluate variables that are not found in the supplied data argument.
Formulas created with the ~ operator use the environment in which they were created. Formulas created with as.formula will use the env argument for their environment.
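From this one would expect that you don't need to care about the environment if you use the data argument. Sadly that's not the case here, because the weights expression is evaluated within the formula's environment (thanks to user2554330 for pointing this out!). In your example lm_formula was created outside the function, so its environment is the global environment, where no object called modeldata exists. Hence the error.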
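To make the environment point concrete, here is a minimal sketch that only inspects formula environments and fits no model. The helper show_envs() is a hypothetical name introduced purely for illustration; it is not part of the question's code.

```r
# A formula built outside any function keeps the environment it was created in.
f_outside <- as.formula("y ~ x1 + x2")

# show_envs() is a hypothetical helper used only for this illustration.
show_envs <- function(formula_arg) {
  # A formula created with ~ inside the function uses this call's environment.
  f_inside <- y ~ x1 + x2
  list(
    arg_is_local    = identical(environment(formula_arg), environment()),
    inside_is_local = identical(environment(f_inside), environment())
  )
}

res <- show_envs(f_outside)
res$arg_is_local     # FALSE: the passed-in formula still points at its creation environment
res$inside_is_local  # TRUE: function arguments such as modeldata are visible from here
```

This mirrors the difference between glm_func() and glm_func2() in the question: only the hardcoded formula carries an environment in which modeldata can be found.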
So, you need to ensure that your function environment is associated with the formula:
glm_func <- function(modeldata, formula, weight){
environment(formula) <- environment()
glm(formula = formula, data = modeldata,
weights = modeldata[[weight]])
}
glm_func(modeldata=aa, formula=lm_formula, weight=weight_var)
#works
Personally, I'd do this instead:
glm_func <- function(modeldata, formula, weight){
environment(formula) <- environment()
eval(
bquote(
glm(formula = .(formula), data = modeldata,
weights = modeldata[[weight]])
)
)
}
This way, the actual formula is printed when you print the model object.
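As a rough check of that difference, the sketch below fits the same model both ways and compares the formula stored in the model's call. The names glm_plain() and glm_bquote(), and the small simulated data frame d, are assumptions made up for this illustration.

```r
set.seed(1)
# Small made-up data set, used only for this illustration.
d <- data.frame(y = rnorm(20), x1 = rnorm(20), z = runif(20))
f <- as.formula("y ~ x1")

# Hypothetical name: the first variant from this answer.
glm_plain <- function(modeldata, formula, weight) {
  environment(formula) <- environment()
  glm(formula = formula, data = modeldata, weights = modeldata[[weight]])
}

# Hypothetical name: the bquote() variant from this answer.
glm_bquote <- function(modeldata, formula, weight) {
  environment(formula) <- environment()
  eval(bquote(
    glm(formula = .(formula), data = modeldata, weights = modeldata[[weight]])
  ))
}

m_plain  <- glm_plain(d, f, "z")
m_bquote <- glm_bquote(d, f, "z")

deparse(m_plain$call$formula)   # "formula": only the argument name was recorded
deparse(m_bquote$call$formula)  # "y ~ x1": the formula itself was spliced into the call
```

Both fits should give the same coefficients; only the recorded call differs.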
Upvotes: 1
Reputation: 388862
As @Roland noted, a formula object has an associated environment. So instead of passing a formula object, you can pass the variable names as strings and create the formula inside the function:
glm_func <- function(modeldata, resp, predictor, weight){
glm(formula = reformulate(predictor, resp),
data = modeldata, weights = modeldata[[weight]])
}
glm_func(modeldata=aa, 'y', c('x1', 'x2'), weight=weight_var)
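For reference, this short sketch shows what reformulate() builds from the strings used in the call above; the variable name f_built is just an illustrative choice.

```r
# Build a formula from character vectors: term labels first, then the response.
f_built <- reformulate(c("x1", "x2"), response = "y")

class(f_built)    # "formula"
deparse(f_built)  # "y ~ x1 + x2"
```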
Upvotes: 0