Grammilo

Reputation: 1369

How to extract the correct model using step() in R for BIC criteria?

I am trying to select the best model by the AIC and BIC criteria on the Boston dataset. I have already found the best model according to AIC. For BIC I used the following code:

# Selecting best possible model using BIC
lm.BIC<-step(lm.null,scope = list(lower = lm.null, upper = lm.full), direction = "both", trace = TRUE, k = log(nrow(training)))

The output of the above code is below:

[Screenshot of the step() trace output; image not preserved]

The trace shows models with decreasing AIC values. I then checked the BIC of the final model (model 1: medv ~ lstat + rm + ptratio + black + dis + nox) and of another model (model 2: medv ~ lstat + rm + ptratio + black + dis + nox + rad + tax + zn), which has three extra variables. I found that the BIC of model 2 is lower than that of model 1. So I am confused about how to extract the best BIC model from the above code and its output when all it shows are AIC values, which I don't want to judge my models on.

Thank You

Upvotes: 2

Views: 4818

Answers (1)

L Steyn

Reputation: 21

See the help page for the AIC function:

?AIC 

There you will find that the AIC is defined as -2*L + k*npar, where L is the log-likelihood, npar is the number of parameters in the fitted model and k = 2 strictly for the AIC. The BIC is defined as -2*L + log(n)*npar. That is, for the BIC we have that k = log(n) with n the number of observations. Therefore, your code is correct and does in fact calculate BIC. The following code finds the best model based on BIC:

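# k = log(n) makes step() penalise each parameter by log(n), i.e. BIC
opt_step <- step(lm.null, scope = list(lower = lm.null, upper = lm.full),
                 direction = "both", trace = TRUE,
                 k = log(nrow(training)))

# The "AIC" column of the anova component holds the BIC at each step
bic_path <- opt_step$anova$AIC
print(bic_path)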

Although the output is labelled AIC, step() is calculating the BIC because k was changed from its default value of 2.
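If the numbers printed by step() do not match what BIC() returns for the same model, that is expected: for lm fits, step() uses extractAIC(), which drops terms that depend only on n. The sketch below illustrates this. It uses the built-in mtcars data as a stand-in, since the question's training set and the definitions of lm.null and lm.full are not shown; the response mpg and the variable names step_bic, full_bic and offset are assumptions made for illustration.

```r
# Stand-in data: mtcars ships with base R; the question's `training` set is not shown.
training <- mtcars
n <- nrow(training)

# Assumed definitions: intercept-only and all-predictor models.
lm.null <- lm(mpg ~ 1, data = training)
lm.full <- lm(mpg ~ ., data = training)

opt_step <- step(lm.null,
                 scope = list(lower = lm.null, upper = lm.full),
                 direction = "both", trace = FALSE, k = log(n))

# Criterion as step() computes it (via extractAIC) versus BIC() on the same fit.
step_bic <- extractAIC(opt_step, k = log(n))[2]
full_bic <- BIC(opt_step)

# For lm fits the two differ by a constant that depends only on n,
# so models fitted to the same data are ranked identically.
offset <- n * (1 + log(2 * pi)) + log(n)
print(c(step = step_bic, BIC = full_bic,
        difference = full_bic - step_bic, expected = offset))
```

The selected model itself is opt_step, so formula(opt_step) or summary(opt_step) can be used to inspect it. Note also that stepwise selection adds or drops one term at a time, so it can stop at a model that a larger, unvisited model beats on BIC; this may be why a model with three extra variables showed a lower BIC in the question, though the thread does not confirm it.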

Upvotes: 2
