Reputation: 43
I am trying to get a data frame with different variables and run a linear model for each combination of those variables.
A simple example is:
names <- c("Var1", "Var2", "Var3")
vars <- ggm::powerset(names, sort = T, nonempty = T)
The powerset
function gives me all the combinations of the 3 variables -- a list with 7 elements, each element is of type character. (the actual code I am trying to run has 16 variables, that's why I don´t want to manually write each of the models).
What I would like to do now is run the models in each of these combinations of variables. For now, I have written the following code (it does not work, but might be a good start):
i <- 1
for (dep_var in vars){
assign(paste0("modelo", i), lm(lapply(paste("Y", dep_var, sep = "~"), formula),
data = data))
i <- i+1
}
Ultimately, I would like to create different models and select the best model from a combination of in sample AIC and BIC and out of sample MAE and RMSE.
Thank you very much! Any help is appreciated!
Upvotes: 2
Views: 216
Reputation: 1086
A simple example working with mtcars
data with mpg
as dependent variable and the collection of AIC for model selection. I agree that there are lot of R packages for testing multiple models according to some criterion (bestglm
, leaps
, glmulti
, MASS
).
mtcars_exp <- mtcars %>%
select(-mpg)
vars <- ggm::powerset(names(mtcars_exp), sort = T, nonempty = T)
results <- data.frame(AIC_lm_fit = vector("numeric"),
model = vector("character"))
For loop for fitting linear models
for (j in 1:length(vars)){
lm_fit <- lm(mpg ~ ., data = mtcars[,c("mpg", vars[[j]])])
results <- rbind(results,
data.frame(AIC_lm_fit = AIC(lm_fit),
model = toString(vars[[j]])))
}
Selection of the model with the minimum AIC
results[AIC_lm_fit==min(AIC_lm_fit),]
AIC_lm_fit model
105 154.1194 wt, qsec, am
Comparison with the results from MASS::stepAIC function
library(MASS)
lm_fit <- lm(mpg ~ ., data = mtcars)
lm_fit_2 <- stepAIC(lm_fit)
lm_fit_2$call
lm(formula = mpg ~ wt + qsec + am, data = mtcars)
Upvotes: 3