Reputation: 850
Please consider the following.
I want to use lapply() to successively pass several argument values, stored in a character vector, to another function. A minimal reproducible example is to apply two or more "families" to the glm() function. Please note that fitting these families to this data may be statistically nonsensical; the example is for illustration only.
The data are taken from the example in ?glm:
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
data.frame(treatment, outcome, counts) # showing data
We can now run a GLM with family "gaussian" or "poisson":
glm(counts ~ outcome + treatment, family = "gaussian")
glm(counts ~ outcome + treatment, family = "poisson")
This could also be "automated" by creating a character vector with these family names:
families <- c("poisson", "gaussian")
And then using it in an lapply() call. But once this runs, the call stored in each fitted model no longer shows the family name; it shows the anonymous function's argument x instead:
lapply(families, function(x) glm(counts ~ outcome + treatment, family = x))
#> [[1]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = x)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.045e+00 -4.543e-01 -2.930e-01 -3.242e-16 -2.148e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 10.58
#> Residual Deviance: 5.129 AIC: 56.76
#>
#> [[2]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = x)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 2.100e+01 -7.667e+00 -5.333e+00 2.221e-15 2.971e-15
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 176
#> Residual Deviance: 83.33 AIC: 57.57
Question:
How can the family names from the vector families be preserved/shown in the function call after lapply()?
Desired outcome: The outcome should look like this:
#> [[1]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = "poisson")
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.045e+00 -4.543e-01 -2.930e-01 -3.242e-16 -2.148e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 10.58
#> Residual Deviance: 5.129 AIC: 56.76
#>
#> [[2]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = "gaussian")
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 2.100e+01 -7.667e+00 -5.333e+00 2.221e-15 2.971e-15
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 176
#> Residual Deviance: 83.33 AIC: 57.57
I tried eval(bquote(x)) as suggested here: R: Passing named function arguments from vector, but this did not work. See:
lapply(families, function(x) glm(counts ~ outcome + treatment, family = eval(bquote(x))))
#> [[1]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = eval(bquote(x)))
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.045e+00 -4.543e-01 -2.930e-01 -3.242e-16 -2.148e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 10.58
#> Residual Deviance: 5.129 AIC: 56.76
#>
#> [[2]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = eval(bquote(x)))
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 2.100e+01 -7.667e+00 -5.333e+00 2.221e-15 2.971e-15
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 176
#> Residual Deviance: 83.33 AIC: 57.57
Created on 2022-07-22 by the reprex package (v2.0.1)
Thank you!
Upvotes: 2
Views: 53
Reputation: 25528
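Yet another possible solution, assuming the data are stored in a data frame df (for example, df <- data.frame(treatment, outcome, counts)):
families <- c("gaussian", "poisson")
# Paste each family name into the code text, then parse and evaluate it,
# so the stored call contains the name itself rather than x
lapply(families, \(x) eval(parse(text = paste0(
  "glm(counts ~ outcome + treatment, data = df, family = ", x, ")"))))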
#> [[1]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = gaussian,
#> data = df)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 2.100e+01 -7.667e+00 -5.333e+00 8.729e-16 7.252e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 176
#> Residual Deviance: 83.33 AIC: 57.57
#>
#> [[2]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = poisson,
#> data = df)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.045e+00 -4.543e-01 -2.930e-01 1.011e-15 7.105e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 10.58
#> Residual Deviance: 5.129 AIC: 56.76
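For context on why pasting the name into the code text changes the printed call: glm() stores the result of match.call(), which records each argument as the expression that was written, not as its value. The sketch below illustrates this with a toy function; record_call() is a hypothetical stand-in for glm() and is not part of any package.

```r
# record_call() is a hypothetical stand-in for glm(): it simply returns
# the call as match.call() sees it, without evaluating the argument.
record_call <- function(family) match.call()

x <- "poisson"

# Called with a variable: the call records the symbol x.
call_from_variable <- record_call(family = x)

# Called via pasted code text: the call records the symbol poisson.
call_from_text <- eval(parse(text = paste0("record_call(family = ", x, ")")))

print(call_from_variable)
print(call_from_text)
```

This is also why the calls above show family = gaussian without quotes: the pasted text contains the bare name, which glm() resolves to the family function.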
Upvotes: 2
Reputation: 174536
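A more direct way to do this would be to build and evaluate the call directly inside lapply() (df is assumed to be a data frame holding the three variables):
lapply(families, function(x) {
  # Build the call with the value of x already inserted, then evaluate it
  eval(as.call(list(quote(glm),
                    formula = counts ~ outcome + treatment,
                    data = quote(df),
                    family = x)))
})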
#> [[1]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = "poisson",
#> data = df)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.045e+00 -4.543e-01 -2.930e-01 1.338e-15 1.421e-15
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 10.58
#> Residual Deviance: 5.129 AIC: 56.76
#>
#> [[2]]
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = "gaussian",
#> data = df)
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 2.100e+01 -7.667e+00 -5.333e+00 2.056e-16 7.252e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 176
#> Residual Deviance: 83.33 AIC: 57.57
Created on 2022-07-22 by the reprex package (v2.0.1)
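The same idea can probably be written more compactly with bquote(), which the question tried but applied to x alone. Wrapping the whole glm() call in bquote() and marking the argument as .(x) substitutes the value before evaluation. This is a minimal sketch using the question's vectors directly (no df), not a tested drop-in for every modelling function.

```r
# Data from the question (taken from the example in ?glm)
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
families  <- c("poisson", "gaussian")

# bquote() replaces .(x) with the value of x, so the call that glm()
# records contains the family name instead of the symbol x.
fits <- lapply(families, function(x) {
  eval(bquote(glm(counts ~ outcome + treatment, family = .(x))))
})

fits[[1]]$call
fits[[2]]$call
```

Because the substitution happens before glm() is called, functions that re-use the stored call, such as update(), should also see the actual family name.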
Upvotes: 3
Reputation: 7297
One approach is to extract the family name from each fitted model and write it into the call stored in the model object. For instance:
lapply(families, \(fam) {
  model <- glm(counts ~ outcome + treatment, family = fam)
  model$call$family <- model$family$family  # replace fam in the stored call with the family name
  model
})
Output:
[[1]]
Call: glm(formula = counts ~ outcome + treatment, family = "poisson")
Coefficients:
(Intercept) outcome2 outcome3 treatment2 treatment3
3.045e+00 -4.543e-01 -2.930e-01 -3.242e-16 -2.148e-16
Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
Null Deviance: 10.58
Residual Deviance: 5.129 AIC: 56.76
[[2]]
Call: glm(formula = counts ~ outcome + treatment, family = "gaussian")
Coefficients:
(Intercept) outcome2 outcome3 treatment2 treatment3
2.100e+01 -7.667e+00 -5.333e+00 2.221e-15 2.971e-15
Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
Null Deviance: 176
Residual Deviance: 83.33 AIC: 57.57
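Depending on the purpose, you could also (just) name the elements of your vector. lapply() carries those names over to the result, so each list element is labelled with its family, although the printed call still shows family = x.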
families <- c(poisson = "poisson", gaussian = "gaussian")
lapply(families, function(x) glm(counts ~ outcome + treatment, family = x))
Output:
$poisson
Call: glm(formula = counts ~ outcome + treatment, family = x)
Coefficients:
(Intercept) outcome2 outcome3 treatment2 treatment3
3.045e+00 -4.543e-01 -2.930e-01 -3.242e-16 -2.148e-16
Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
Null Deviance: 10.58
Residual Deviance: 5.129 AIC: 56.76
$gaussian
Call: glm(formula = counts ~ outcome + treatment, family = x)
Coefficients:
(Intercept) outcome2 outcome3 treatment2 treatment3
2.100e+01 -7.667e+00 -5.333e+00 2.221e-15 2.971e-15
Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
Null Deviance: 176
Residual Deviance: 83.33 AIC: 57.57
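If naming the vector by hand feels redundant, a possible shortcut is sapply() with simplify = FALSE, which returns a list like lapply() but, by default, names the elements after an unnamed character vector. A minimal sketch, with the family of each fit read back from the model object:

```r
# Data from the question (taken from the example in ?glm)
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
families  <- c("poisson", "gaussian")

# simplify = FALSE keeps the result as a list; the default USE.NAMES = TRUE
# uses the character vector itself as the element names.
fits <- sapply(families,
               function(x) glm(counts ~ outcome + treatment, family = x),
               simplify = FALSE)

names(fits)

# The family actually used is stored in each model, independent of the call
fam_names <- vapply(fits, function(m) m$family$family, character(1))
fam_names
```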
Update: added the first approach.
Upvotes: 3