Reputation: 386
When estimating a lasso model via the glmnet package, I am wondering whether it is better to: (a) pull coefficients / predictions / deviance straight from the cv.fit object produced by cv.glmnet, or (b) use the minimum lambda from cv.glmnet to re-run glmnet and pull these objects from the glmnet fit. (Please be patient -- I have a feeling that this is documented, but I'm seeing examples/tutorials of both online, and no solid logic for going one way or the other.)
That is, for coefficients, I can run (a):
library(glmnet)
cvfit = cv.glmnet(x=xtrain, y=ytrain, alpha=1, type.measure = "mse", nfolds = 20)
coef.cv <- coef(cvfit, s = "lambda.min")
Or I can afterwards run (b):
fit = glmnet(x=xtrain, y=ytrain, alpha=1, lambda=cvfit$lambda.min)
coef <- coef(fit, s = "lambda.min")
While these two processes select the same model variables, they do not produce identical coefficients. Similarly, I could predict via either of the following two processes:
prdct <- predict(fit,newx=xtest)
prdct.cv <- predict(cvfit, newx=xtest, s = "lambda.min")
And they predict similar but NOT identical vectors.
Last, I would have THOUGHT I could pull % deviance explained via either of the two methods:
percdev <- fit$dev.ratio
percdev.cv <- cvfit$glmnet.fit$dev.ratio[cvfit$cvm==mse.min.cereal]
But in fact, it is not possible to pull percdev.cv in this way, because if the lambda sequence used by cv.glmnet has fewer than 100 elements, the lengths of cvfit$glmnet.fit$dev.ratio and cvfit$cvm==mse.min.cereal don't match. So I'm not quite sure how to pull the minimum-lambda dev.ratio from cvfit$glmnet.fit.
So I guess I'm wondering which process is best, why, and how people normally pull the appropriate dev.ratio statistic. Thanks!
Upvotes: 6
Views: 1456
Reputation: 46908
As pointed out in the comments, it has to do with the lambda sequence provided: if you look at the source code of cv.glmnet, it calls glmnet:::cv.glmnet.raw, which in its first few lines runs glmnet() on the defined lambda values.
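In other words, the fit stored inside the cv object is itself a glmnet() fit along the full lambda path, which you can check directly (a minimal sketch on toy data; the simulated x and y are just placeholders):

```r
library(glmnet)

# Toy data standing in for any design matrix / response
set.seed(1)
x <- matrix(rnorm(100 * 5), 100, 5)
y <- rnorm(100)

cvfit <- cv.glmnet(x, y)

# The cv object carries a full-path glmnet fit inside it, fitted on
# the same lambda sequence that cv.glmnet used for cross-validation:
identical(cvfit$lambda, cvfit$glmnet.fit$lambda)  # should be TRUE
```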
So we can use an example below:
library(glmnet)
library(mlbench)
data(BostonHousing)
data = BostonHousing
data$chas=as.numeric(data$chas)
cvfit = cv.glmnet(x=as.matrix(data[,-14]),y=data[,14])
coef.cv <- coef(cvfit, s = "lambda.min")
fit = glmnet(x=as.matrix(data[,-14]), y=data[,14], alpha=1, lambda=cvfit$lambda.min)
coef <- coef(fit, s = "lambda.min")
head(cbind(coef.cv,coef))
6 x 2 sparse Matrix of class "dgCMatrix"
1 1
(Intercept) 31.74123706 31.86654225
crim -0.09834634 -0.09869320
zn 0.04144161 0.04158829
indus . .
chas 2.68518774 2.68163334
nox -16.30664523 -16.35459059
They are slightly different. But if you provide the full lambda sequence that cv.glmnet used:
fit = glmnet(x=as.matrix(data[,-14]), y=data[,14], alpha=1, lambda=cvfit$lambda)
coef <- coef(fit, s = cvfit$lambda.min)
head(cbind(coef.cv,coef))
6 x 2 sparse Matrix of class "dgCMatrix"
1 1
(Intercept) 31.74123706 31.74123706
crim -0.09834634 -0.09834634
zn 0.04144161 0.04144161
indus . .
chas 2.68518774 2.68518774
nox -16.30664523 -16.30664523
They are the same now. And the dev.ratio will match too:
fit$dev.ratio[fit$lambda==cvfit$lambda.min]
[1] 0.7401482
cvfit$glmnet.fit$dev.ratio[which.min(cvfit$cvm)]
[1] 0.7401482
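For the same reason, a safe way to pull the statistic is to index the internal path fit by lambda value rather than by a logical test on cvm (a sketch, reusing the cvfit fitted above):

```r
# Find the position of lambda.min in the internal path fit.
# Exact equality is fine here: lambda.min is drawn from this sequence.
i <- which(cvfit$glmnet.fit$lambda == cvfit$lambda.min)
cvfit$glmnet.fit$dev.ratio[i]

# Equivalent: cvm runs over the same lambda sequence, so the index
# of the minimum CV error points at the same element.
cvfit$glmnet.fit$dev.ratio[which.min(cvfit$cvm)]
```

This sidesteps the length-mismatch problem in the question, because both indexing schemes run over the same lambda sequence regardless of how many elements it has.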
Upvotes: 3