Reputation: 3883
This question is closely related to "R - how to pass formula to a with(df, glm(y ~ x)) construction inside a function", but it asks a broader question.
Why do these expressions work?
text_obj <- "mpg ~ cyl"
form_obj <- as.formula(text_obj)
with(mtcars, lm(mpg ~ cyl))
with(mtcars, lm(as.formula(text_obj)))
lm(form_obj, data = mtcars)
But not this one?
with(mtcars, lm(form_obj))
Error in eval(predvars, data, env) : object 'mpg' not found
I would usually use the data argument, but this is not possible with the mice package, i.e.:
library(mice)
mtcars[5, 5] <- NA # introduce a missing value to be imputed
mtcars.imp = mice(mtcars, m = 5)
These don't work
lm(form_obj, data = mtcars.imp)
with(mtcars.imp, lm(form_obj))
but this does
with(mtcars.imp, lm(as.formula(text_obj)))
Thus, is it better to always call as.formula inside the function, rather than construct the formula first and then pass it in?
Upvotes: 2
Views: 1089
Reputation: 1
I needed one addition to the environment(form_obj) = environment() approach. form_obj might refer to a variable that lives in the environment in which the formula was created. To cover this case, one can add the environment in which the formula was created as a parent, e.g.
ev <- environment()
parent.env(ev) <- environment(form_obj)
environment(form_obj) <- ev
I also needed to set the parent environment when using vegan::rda in the function doPRC in the package PRC (https://github.com/CajoterBraak/PRC). In interactive sessions the function worked when I only reset the formula's environment, as in the previous solution (thanks!), but it did not work when run through the code demos of the package PRC. It did run once I set the parent environment explicitly, as above. The reason may be that vegan::rda has its own way of dealing with this issue, namely by adding the R global environment within the rda function, and that when running the code demos this environment does not contain the new variable. Perhaps this is more a peculiarity or limitation of the vegan package than a general issue.
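As a rough illustration of why the parent environment can matter, here is a minimal sketch using only base R and mtcars. The helper make_formula and the variable k are hypothetical names introduced for this example; they stand in for a formula that refers to something defined where the formula was created.

```r
# Hypothetical helper: the formula refers to `k`, which exists only in
# this function's frame (the formula's original environment).
make_formula <- function() {
  k <- 2
  mpg ~ I(cyl * k)
}
form_obj <- make_formula()

fit <- with(mtcars, {
  ev <- environment()                      # temporary env holding the mtcars columns
  parent.env(ev) <- environment(form_obj)  # fall back to the formula's original env for `k`
  environment(form_obj) <- ev              # columns are looked up first, then `k`
  lm(form_obj)
})
coef(fit)
```

With only environment(form_obj) <- environment() and no parent set, the columns would be found but k would not, because the temporary environment's parent would not include the frame where k lives.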
Upvotes: 0
Reputation: 34763
An important "hidden" aspect of formulas is their associated environment.
When form_obj is created, its environment is set to where form_obj was created:
environment(form_obj)
# <environment: R_GlobalEnv>
For every other version, the formula is created from within with(), and its environment is set to that temporary environment. It's easiest to see this with the as.formula approach by splitting it into a few steps:
with(mtcars, {
f = as.formula(text_obj)
print(environment(f))
lm(f)
})
# <environment: 0x7fbb68b08588>
We can make the form_obj approach work by editing its environment before calling lm:
with(mtcars, {
# set form_obj's environment to the current one
environment(form_obj) = environment()
lm(form_obj)
})
The help page for ?formula is a bit long, but there's a section on environments:
Environments
A formula object has an associated environment, and this environment (rather than the parent environment) is used by model.frame to evaluate variables that are not found in the supplied data argument.
Formulas created with the ~ operator use the environment in which they were created. Formulas created with as.formula will use the env argument for their environment.
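The two rules quoted above can be checked directly. This is a small sketch; make_tilde, e, and the result variables are names made up for the illustration.

```r
# A formula built with `~` inside a function captures that function's frame.
make_tilde <- function() {
  list(f = y ~ x, env = environment())
}
res <- make_tilde()
tilde_uses_creating_env <- identical(environment(res$f), res$env)

# A formula built from a string takes the `env` passed to as.formula().
e <- new.env()
g <- as.formula("y ~ x", env = e)
string_uses_env_arg <- identical(environment(g), e)

# An object that is already a formula is returned unchanged,
# so `env` has no effect on it.
h <- as.formula(res$f, env = e)
existing_formula_keeps_env <- identical(environment(h), res$env)
```

The last case is worth keeping in mind: calling as.formula on something that is already a formula does not move it to a new environment.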
The upshot is that making a formula with ~ sweeps the environment part "under the rug"; in more general settings, it's safer to use as.formula, which gives you fuller control over the environment to which the formula applies.
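To make the "fuller control" point concrete, here is one possible sketch in which the formula is built from text and pointed at an environment that holds the data columns. The name data_env and the use of list2env are assumptions for this example, not something the question or the mice package prescribes.

```r
text_obj <- "mpg ~ cyl"

# Assumption for illustration: copy the mtcars columns into an environment.
data_env <- list2env(mtcars)

# Build the formula from text and choose its environment explicitly.
f <- as.formula(text_obj, env = data_env)

# No data argument: lm() looks the variables up in the formula's environment.
fit_env <- lm(f)

# Same model fitted the usual way, for comparison.
fit_data <- lm(mpg ~ cyl, data = mtcars)
```

Both fits should give the same coefficients; the only difference is where lm finds mpg and cyl.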
You might also check Hadley's chapter on environments:
http://adv-r.had.co.nz/Environments.html
Upvotes: 7