Reputation: 239
I am trying to write a function that calls glm() internally, but I always get an error. It looks like the function does not look up the variables in the data frame I pass in.
set.seed(234)
sex <- sample(c("M", "F"), size=100, replace=TRUE)
age <- rnorm(n=100, mean=20 + 4*(sex=="F"), sd=0.1)
dsn <- data.frame(sex, age)
rm(sex, age) #remove sex and age from the global environment for reproducibility
to_analyze <- function(dep, indep, data){
  glm(dep ~ factor(indep), data = data)
}
to_analyze(dep=age, indep=sex, data=dsn)
#> Error in eval(predvars, data, env): object 'age' not found
Upvotes: 2
Views: 1044
Reputation: 239
@Onyambu and others: substitute() works well for a single call, as in to_analyze() below. However, when I call another function inside the wrapper, it fails again. Any help would be greatly appreciated.
to_analyze <- function(dep, indep, data){
  glm(substitute(dep ~ factor(indep)), data = data)
}
to_analyze(dep=age, indep=sex, data=dsn)
#>
#> Call: glm(formula = substitute(dep ~ factor(indep)), data = data)
#>
#> Coefficients:
#> (Intercept) factor(sex)M
#> 24.006 -4.034
#>
#> Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
#> Null Deviance: 397.3
#> Residual Deviance: 0.8152 AIC: -191.2
However, I am stuck again: I am trying to pass the fitted model to lsmeans::lsmeans() to predict marginal means and return the output, but it gives me an error. Although this model does not need an offset, I am including one here so that I end up with more general code that I can modify later.
to_predict_lsmeans <- function(dep, indep, data){
model <- glm(substitute(dep ~ factor(indep)), data=data)
pred <- lsmeans:: lsmeans(model, substitute(~ factor(indep)), offset=substitute(data)$log(age), type ="response" )
return(pred)
}
pred <- to_predict_lsmeans(dep=age, indep=sex, data=dsn)
#> Error in ref_grid(object, ...): We are unable to reconstruct the data.
#> The variables needed are:
#> sex
#> Are any of these actually constants? (specify via 'params = ')
#> The dataset name is:
#> data
#> Does the data still exist? Or you can specify a dataset via 'data = '
pred
#> Error in eval(expr, envir, enclos): object 'pred' not found
Upvotes: 0
Reputation: 79208
You could use any of the following:
Using substitute:
to_analyze <- function(dep, indep, data){
  glm(substitute(dep ~ factor(indep)), data = data)
}
to_analyze(dep=age, indep=sex, data=dsn)
Advantage: the independent argument can itself be written as an expression, e.g.:
to_analyze(Petal.Width, Sepal.Length + Sepal.Width, data = iris)
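To see why the substitute approach works, it can help to inspect what substitute() builds inside a function. The following is a minimal sketch; show_formula is a hypothetical helper introduced only for illustration and is not part of the answer. It returns the unevaluated expression with the caller's arguments spliced in, without fitting anything, so age and sex do not even need to exist.

```r
# Hypothetical helper: shows what substitute() produces, without fitting a model.
show_formula <- function(dep, indep) {
  # dep and indep are replaced by the expressions the caller typed
  f <- substitute(dep ~ factor(indep))
  list(class = class(f), text = deparse(f))
}

res <- show_formula(age, sex)
res$class  # "call": an unevaluated language object, not yet a formula
res$text   # "age ~ factor(sex)"
```

Because the arguments are never evaluated, no "object not found" error is raised; the column names are only looked up later, inside the data frame given to glm().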
Using reformulate, as suggested by @NelsonGon:
to_analyze <- function(dep, indep, data){
  glm(reformulate(sprintf("factor(%s)", indep), dep), data = data)
}
Note that to call this function, the variable names must be passed as character strings:
to_analyze(dep= "age", indep="sex", data=dsn)
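As a rough illustration of what reformulate() returns, the sketch below builds the formula on its own, without a model. The second predictor name, grp, is hypothetical and appears only to show that a character vector of several names is joined with "+".

```r
# Character inputs; `grp` is a hypothetical second predictor, for illustration only.
indep <- c("sex", "grp")
fmla <- reformulate(sprintf("factor(%s)", indep), response = "age")

class(fmla)    # "formula"
deparse(fmla)  # "age ~ factor(sex) + factor(grp)"
```

Since the result is already a formula object, it can be passed to glm() directly, as in the function above.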
Recall that glm can also take a string that can be parsed to a formula:
to_analyze <- function(dep, indep, data){
  glm(sprintf("%s ~ factor(%s)", dep, indep), data = data)
}
to_analyze("age", "sex", data=dsn)
or even:
to_analyze <- function(dep, indep, data){
  glm(paste(dep, "~ factor(", indep, ")"), data = data)
}
to_analyze("age", "sex", data=dsn)
Lastly, you can combine substitute and paste:
to_analyze <- function(dep, indep, data){
  glm(paste(substitute(dep), "~ factor(", substitute(indep), ")"), data = data)
}
This works for both symbols and character strings, e.g.:
to_analyze(age, sex, data=dsn)
to_analyze("age", "sex", data=dsn)
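A quick way to check that both call styles give the same fit is sketched below. It uses the built-in mtcars data (an assumption made so the snippet is self-contained; the thread itself uses dsn) and compares the coefficients of the two models.

```r
# Same combined substitute + paste function as above.
to_analyze <- function(dep, indep, data) {
  glm(paste(substitute(dep), "~ factor(", substitute(indep), ")"), data = data)
}

# mtcars is used here only so the example runs on its own.
fit_sym <- to_analyze(mpg, cyl, data = mtcars)      # bare symbols
fit_chr <- to_analyze("mpg", "cyl", data = mtcars)  # character strings

# Both calls should build the same formula string and hence the same model.
same_fit <- isTRUE(all.equal(coef(fit_sym), coef(fit_chr)))
same_fit
```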
Upvotes: 3
Reputation: 76402
Create a "formula" object in the function and pass it to glm. To capture the variable names without triggering an error, the standard trick is deparse(substitute(.)). Then compose the formula with paste.
to_analyze <- function(dep, indep, data){
  dep <- deparse(substitute(dep))
  indep <- deparse(substitute(indep))
  indep <- paste0("factor(", indep, ")")
  fmla <- paste(dep, indep, sep = " ~ ")
  fmla <- as.formula(fmla)
  glm(fmla, data = data)
}
to_analyze(dep=age, indep=sex, data=dsn)
#
#Call: glm(formula = fmla, data = data)
#
#Coefficients:
# (Intercept) factor(sex)M
# 23.984 -3.984
#
#Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
#Null Deviance: 396.2
#Residual Deviance: 0.837 AIC: -188.5
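Note that the printed call above shows formula = fmla rather than the actual formula. If that matters, one possible variation (a sketch, not part of the original answer; to_analyze2 is a hypothetical name) is to splice the formula into the call with bquote() before evaluating it, so the stored call records the formula itself.

```r
# Recreate the example data from the question.
set.seed(234)
sex <- sample(c("M", "F"), size = 100, replace = TRUE)
age <- rnorm(n = 100, mean = 20 + 4 * (sex == "F"), sd = 0.1)
dsn <- data.frame(sex, age)
rm(sex, age)

# Hypothetical variant of to_analyze(): same formula construction,
# but the formula object is spliced into the glm() call via bquote().
to_analyze2 <- function(dep, indep, data) {
  dep <- deparse(substitute(dep))
  indep <- deparse(substitute(indep))
  fmla <- as.formula(paste0(dep, " ~ factor(", indep, ")"))
  # .(fmla) is replaced by the formula itself before glm() is called
  eval(bquote(glm(.(fmla), data = data)))
}

fit <- to_analyze2(dep = age, indep = sex, data = dsn)
deparse(fit$call$formula)  # "age ~ factor(sex)"
```

The data argument is still recorded as data in the call, so this only changes how the formula is displayed and stored.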
Upvotes: 1