Reputation: 197
Here is some example data.
ind1 <- rnorm(99)
ind2 <- rnorm(99)
ind3 <- rnorm(99)
ind4 <- rnorm(99)
ind5 <- rnorm(99)
dep <- rnorm(99, mean=ind1)
group <- rep(c("A", "B", "C"), each=33)
df <- data.frame(dep,group, ind1, ind2, ind3, ind4, ind5)
Here, a linear regression model is fitted for every combination of the independent variables in df, separately for each level of the categorical variable group. The result is satisfactory, but my original data has many more than 5 variables, so the resulting list is hard to read and compare. I would therefore like to pick the 5 best models for each group from the resulting list (tibble_list) based on the AIC value. Any help would be highly appreciated.
library(tidyverse)
library(broom)

# All combinations of 1 to 5 independent variables
indvar_list <- lapply(1:5, function(x)
  combn(paste0("ind", 1:5), x, simplify = FALSE))

# One formula per combination
formulas_list <- rapply(indvar_list, function(x)
  as.formula(paste("dep ~", paste(x, collapse = "+"))))

# Fit one formula within each group and collect the model summaries
run_model <- function(f) {
  df %>%
    nest(-group) %>%
    mutate(fit = map(data, ~ lm(f, data = .)),
           results1 = map(fit, glance),
           results2 = map(fit, tidy)) %>%
    unnest(results1) %>%
    unnest(results2) %>%
    select(group, term, estimate, r.squared, p.value, AIC) %>%
    mutate(estimate = exp(estimate))
}

tibble_list <- lapply(formulas_list, run_model)
tibble_list
Upvotes: 1
Views: 43
Reputation: 887223
One option is to bind the rows into a single dataset with an .id column ('index'), arrange by 'group' and 'AIC', group by 'group', and then filter the rows that belong to the first five unique 'index' values in each group.
library(tidyverse)
bind_rows(tibble_list, .id = 'index') %>%
  arrange(group, AIC) %>%
  group_by(group) %>%
  filter(index %in% head(unique(index), 5))
# A tibble: 51 x 7
# Groups:   group [3]
#    index group term        estimate r.squared  p.value   AIC
#    <chr> <fct> <chr>          <dbl>     <dbl>    <dbl> <dbl>
#  1 1     A     (Intercept)    0.897     0.319 0.000620  79.5
#  2 1     A     ind1           2.07      0.319 0.000620  79.5
#  3 7     A     (Intercept)    0.883     0.358 0.00129   79.5
#  4 7     A     ind1           2.14      0.358 0.00129   79.5
#  5 7     A     ind3           0.849     0.358 0.00129   79.5
#  6 8     A     (Intercept)    0.890     0.351 0.00153   79.9
#  7 8     A     ind1           2.12      0.351 0.00153   79.9
#  8 8     A     ind4           0.860     0.351 0.00153   79.9
#  9 19    A     (Intercept)    0.877     0.387 0.00237   80.0
# 10 19    A     ind1           2.18      0.387 0.00237   80.0
# … with 41 more rows
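If it helps to make the ranking step explicit, the same selection can probably also be written by ranking each model once per group and then keeping all term rows of the selected models. The following is only a minimal sketch on made-up data: `models` is a hypothetical stand-in for `bind_rows(tibble_list, .id = 'index')`, its AIC values are invented for illustration, and `n_best` is set to 2 only to keep the toy example small (the question would use 5). It assumes dplyr 1.0.0 or later for `slice_min()`.

```r
library(dplyr)

# Hypothetical stand-in for bind_rows(tibble_list, .id = "index"):
# one row per model term, with the model-level AIC repeated on each row.
# The values below are invented for illustration only.
models <- data.frame(
  index = c("1", "1", "2", "2", "3", "3", "3"),
  group = "A",
  term  = c("(Intercept)", "ind1", "(Intercept)", "ind2",
            "(Intercept)", "ind1", "ind2"),
  AIC   = c(79.5, 79.5, 95.2, 95.2, 80.1, 80.1, 80.1)
)

n_best <- 2  # assumption for the toy data; the question asks for 5

# Reduce to one row per model, then keep the n_best lowest AIC per group.
best <- models %>%
  distinct(group, index, AIC) %>%
  group_by(group) %>%
  slice_min(AIC, n = n_best, with_ties = FALSE) %>%
  ungroup() %>%
  select(group, index)

# Keep every term row that belongs to one of the selected models.
top_models <- models %>%
  semi_join(best, by = c("group", "index")) %>%
  arrange(group, AIC)

top_models
```

Ranking on the de-duplicated (group, index, AIC) rows keeps the number of terms in a model from affecting which models are counted, and `with_ties = FALSE` caps the selection at exactly `n_best` models per group even when AIC values tie.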
Upvotes: 1