Reputation: 386
When estimating a lasso model via the glmnet package, I am wondering whether it is better to: (a) pull coefficients / predictions / deviance straight from the cv.fit object produced by cv.glmnet, or (b) use the minimum lambda from cv.glmnet to re-run glmnet and pull these objects from the glmnet fit. (Please be patient -- I have a feeling that this is documented, but I'm seeing examples/tutorials of both online, and no solid logic for going one way or the other.)
That is, for coefficients, I can run (a):
library(glmnet)
cvfit = cv.glmnet(x=xtrain, y=ytrain, alpha=1, type.measure = "mse", nfolds = 20)
coef.cv <- coef(cvfit, s = "lambda.min")
Or I can afterwards run (b):
fit = glmnet(x=xtrain, y=ytrain, alpha=1, lambda=cvfit$lambda.min)
coef <- coef(fit, s = "lambda.min")
While these two processes select the same model variables, they do not produce identical coefficients. Similarly, I could predict via either of the following two processes:
prdct <- predict(fit,newx=xtest)
prdct.cv <- predict(cvfit, newx=xtest, s = "lambda.min")
And they predict similar but NOT identical vectors.
Last, I would have THOUGHT I could pull % deviance explained via either of the two methods:
percdev <- fit$dev.ratio
percdev.cv <- cvfit$glmnet.fit$dev.ratio[cvfit$cvm==mse.min.cereal]
But in fact, it is not possible to pull percdev.cv in this way, because if the lambda sequence used by cv.glmnet has fewer than 100 elements, the lengths of cvfit$glmnet.fit$dev.ratio and cvfit$cvm==mse.min.cereal don't match. So I'm not quite sure how to pull the minimum-lambda dev.ratio from cvfit$glmnet.fit.
So I guess I'm wondering which process is best, why, and how people normally pull the appropriate dev.ratio statistic. Thanks!
Upvotes: 6
Views: 1456
Reputation: 46908
As pointed out in the comments, it has to do with the lambda sequence provided: if you look at the source code of cv.glmnet, it calls glmnet:::cv.glmnet.raw, which in its first few lines runs glmnet() on the defined lambda values.
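In other words, the fit stored inside the cv object is itself a glmnet() fit along the full lambda path, which you can check directly (a minimal sketch on toy data; the simulated x and y are just placeholders):

```r
library(glmnet)

# Toy data standing in for any design matrix / response
set.seed(1)
x <- matrix(rnorm(100 * 5), 100, 5)
y <- rnorm(100)

cvfit <- cv.glmnet(x, y)

# The cv object carries a full-path glmnet fit inside it, fitted on
# the same lambda sequence that cv.glmnet used for cross-validation:
identical(cvfit$lambda, cvfit$glmnet.fit$lambda)  # should be TRUE
```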
So we can use an example below:
library(glmnet)
library(mlbench)
data(BostonHousing)
data = BostonHousing
data$chas=as.numeric(data$chas)
cvfit = cv.glmnet(x=as.matrix(data[,-14]),y=data[,14])
coef.cv <- coef(cvfit, s = "lambda.min")
fit = glmnet(x=as.matrix(data[,-14]), y=data[,14], alpha=1, lambda=cvfit$lambda.min)
coef <- coef(fit, s = "lambda.min")
head(cbind(coef.cv,coef))
6 x 2 sparse Matrix of class "dgCMatrix"
1 1
(Intercept) 31.74123706 31.86654225
crim -0.09834634 -0.09869320
zn 0.04144161 0.04158829
indus . .
chas 2.68518774 2.68163334
nox -16.30664523 -16.35459059
They are slightly different. But if you provide the full lambda sequence that cv.glmnet used:
fit = glmnet(x=as.matrix(data[,-14]), y=data[,14], alpha=1, lambda=cvfit$lambda)
coef <- coef(fit, s = cvfit$lambda.min)
head(cbind(coef.cv,coef))
6 x 2 sparse Matrix of class "dgCMatrix"
1 1
(Intercept) 31.74123706 31.74123706
crim -0.09834634 -0.09834634
zn 0.04144161 0.04144161
indus . .
chas 2.68518774 2.68518774
nox -16.30664523 -16.30664523
They are the same now. And the dev.ratio will match too:
fit$dev.ratio[fit$lambda==cvfit$lambda.min]
[1] 0.7401482
cvfit$glmnet.fit$dev.ratio[which.min(cvfit$cvm)]
[1] 0.7401482
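For the same reason, a safe way to pull the statistic is to index the internal path fit by lambda value rather than by a logical test on cvm (a sketch, reusing the cvfit fitted above):

```r
# Find the position of lambda.min in the internal path fit.
# Exact equality is fine here: lambda.min is drawn from this sequence.
i <- which(cvfit$glmnet.fit$lambda == cvfit$lambda.min)
cvfit$glmnet.fit$dev.ratio[i]

# Equivalent: cvm runs over the same lambda sequence, so the index
# of the minimum CV error points at the same element.
cvfit$glmnet.fit$dev.ratio[which.min(cvfit$cvm)]
```

This sidesteps the length-mismatch problem in the question, because both indexing schemes run over the same lambda sequence regardless of how many elements it has.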
Upvotes: 3