user3496977

Reputation: 41

glmnet: different results for the same lambda vector, depending on whether it was computed by glmnet or passed in as a parameter

glmnet with ridge regularization (alpha=0) computes the coefficients for the first lambda value differently when the lambda vector is chosen by glmnet itself than when the same vector is passed in the function call. For example, these two models, which I would expect to be identical,

> m <- glmnet(rbind(c(1, 0), c(0, 1)), c(1, 0), alpha=0)
> m2 <- glmnet(rbind(c(1, 0), c(0, 1)), c(1, 0), alpha=0, lambda=m$lambda)

give completely different coefficients:

> coef(m, s=m$lambda[1])
3 x 1 sparse Matrix of class "dgCMatrix"
                        1
(Intercept)  5.000000e-01
V1           1.010101e-36
V2          -1.010101e-36

> coef(m2, s=m2$lambda[1])
3 x 1 sparse Matrix of class "dgCMatrix"
                       1
(Intercept)  0.500000000
V1           0.000998004
V2          -0.000998004

The same happens with other datasets too. When lambda is not provided to glmnet, all coefficients at the largest lambda, coef(m, s=m$lambda[1]), are very close to zero (except for the intercept), so the predictions are the same for any X (due to rounding?).

My questions:

  1. Why is this the case? Is the difference intentional?
  2. How exactly are the coefficients for the largest lambda, coef(m, s=m$lambda[1]), determined?

Upvotes: 4

Views: 793

Answers (1)

Trevor Hastie

Reputation: 231

This is a tricky one. When alpha=0, the "starting" value of lambda (the value at which all coefficients except the intercept are zero) is infinity. Since we want to produce a grid of values that goes to zero geometrically from the starting value, infinity was not much use. So we instead use the starting value that would apply when alpha=0.001 (in this case 500), and that is the largest lambda reported.
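As a rough check of where the 500 comes from, here is a minimal base-R sketch. It assumes the usual elastic-net rule that the starting lambda is max_j |<x_j, y>| / (n * alpha), computed on columns standardized with the 1/n variance convention and a centered response. The names sd_n, alpha_floor and lambda_start are only illustrative, and this is not glmnet's own code.

```r
# Data from the question
x <- rbind(c(1, 0), c(0, 1))
y <- c(1, 0)
n <- nrow(x)

# Standard deviation with the 1/n convention (assumed to mirror
# the scaling glmnet applies when standardize = TRUE)
sd_n <- function(v) sqrt(mean((v - mean(v))^2))

# Standardize each column of x and center y
xs <- apply(x, 2, function(v) (v - mean(v)) / sd_n(v))
yc <- y - mean(y)

# Smallest lambda that sets every coefficient to zero for a given alpha:
# max_j |<x_j, y>| / (n * alpha). At alpha = 0 this would be infinite,
# so the small value 0.001 is substituted, as described above.
alpha_floor <- 0.001
lambda_start <- max(abs(crossprod(xs, yc))) / (n * alpha_floor)
lambda_start
```

Under these assumptions the result agrees with the value of 500 quoted above, and it can be compared directly with m$lambda[1] from the question.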

So in m, the coefficients in the first position really are zero, but the largest lambda reported is 500 (whereas it really was infinity).

In m2, we actually compute the fit at lambda=500 for the first position, so the coefficients are not quite zero.

To verify this, notice that all the subsequent coefficients match.
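One way to run that comparison is sketched below, using the two models from the question. It assumes the glmnet package is installed and that coef() without s returns one column per lambda. The names b, b2, k, first_diff and rest_diff are only illustrative, and exact values may vary between glmnet versions.

```r
library(glmnet)

# Same data and models as in the question
x <- rbind(c(1, 0), c(0, 1))
y <- c(1, 0)
m  <- glmnet(x, y, alpha = 0)
m2 <- glmnet(x, y, alpha = 0, lambda = m$lambda)

# Full coefficient paths: rows are coefficients, columns are lambda values
b  <- as.matrix(coef(m))
b2 <- as.matrix(coef(m2))

# Compare only the columns both paths have, in case the lengths differ
k <- min(ncol(b), ncol(b2))

# Largest absolute difference at the first lambda (expected to be non-zero)
first_diff <- max(abs(b[, 1] - b2[, 1]))

# Largest absolute difference over the remaining lambdas
# (expected to be zero up to the convergence tolerance)
rest_diff <- max(abs(b[, 2:k, drop = FALSE] - b2[, 2:k, drop = FALSE]))

c(first = first_diff, rest = rest_diff)
```

If the explanation holds, the first entry should be on the order of the 0.000998 seen in the question, and the second should be negligible.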

Trevor Hastie

Upvotes: 6
