Reputation: 197
Suppose I have following df.
ind1 <- rnorm(100)
ind2 <- rnorm(100)
ind3 <- rnorm(100)
ind4 <- rnorm(100)
ind5 <- rnorm(100)
dep <- rnorm(100, mean=ind1)
df <- data.frame(dep,ind1, ind2, ind3, ind4, ind5)
I calculated 3rd order polynomial regression between dep variable and each of the ind variables separately using lapply() as follow:
polys <- lapply(df[,-1], function(x) summary(lm(dep ~ poly(x, 3, raw =
TRUE), data = df)))
Now I want to list/order best models based on AIC. I tried these without success.
stepAIC(polys)
polys$AIC
Do you have any idea how I should do this in R?
Thanks in advance!
Upvotes: 1
Views: 1365
Reputation: 1529
Here is one solution:
ind1 <- rnorm(100)
ind2 <- rnorm(100)
ind3 <- rnorm(100)
ind4 <- rnorm(100)
ind5 <- rnorm(100)
dep <- rnorm(100, mean=ind1)
df <- data.frame(dep,ind1, ind2, ind3, ind4, ind5)
# Create all the models
polys <- lapply(df[,-1], function(x) (lm(dep ~ poly(x, 3, raw = TRUE), data = df)))
# Find the best predictors using `setpAIC()`
mod <- lapply(polys, MASS::stepAIC)
# Calculate the AIC of each model and sort it increasingly.
i <- order(unlist(lapply(mod, AIC)))
# Show the models starting with the lowest AIC first
mod[i]
Upvotes: 1
Reputation: 107652
Currently, you are conflating list objects with methods stepAIC
and AIC
that run on specific components. Specifically,
The function stepAIC
in MASS
package runs on individual model objects (e.g., lm
class types), not summary of model objects and not list of model objects (i.e., polys) with your failed attempt: stepAIC(polys)
.
The function AIC
in MASS
package runs on individual model objects usually after the stepAIC
call and is not an item of the return with your failed attempt: polys$AIC
.
Consider the following re-factoring of your code with seeded data for reproducibility to sort list of lm objects, polys, by ascending AIC values:
library(MASS)
set.seed(50619)
#... same data build code ...
# LIST OF lm OBJECTS (NO SUMMARY)
polys <- lapply(df[,-1], function(x) lm(dep ~ poly(x, 3, raw = TRUE), data = df))
# SORTED VECTOR OF AIC VALUES (USING sapply)
AIC_vec <- sort(sapply(polys, function(x) AIC(stepAIC(x))))
AIC_vec
# ind1 ind3 ind2 ind4 ind5
# 297.1865 352.3694 352.8260 352.8260 352.8260
# SORTED polys BY AIC (BEST TO WORST)
sorted_polys <- polys[names(AIC_vec)]
Upvotes: 2