How does standardization work in cv.glmnet? (R)

Question

I've been using the cv.glmnet function from the glmnet package in R to fit a cross-validated LASSO regression. My explanatory variables are all on different scales, so I want to standardize them. I first used the argument standardize = TRUE, then cross-checked the result by scaling my data beforehand and setting standardize = FALSE. However, the two methods give different lambda and MSE results. Please see below.

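set.seed(34064064)

library(glmnet)
library(ISLR2)

# Drop rows with missing values, then build the predictor matrix without the intercept column
Hitters <- na.omit(Hitters)
x <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary

# 10-fold CV LASSO on the raw predictors, letting glmnet standardize internally
cv <- cv.glmnet(x, y, lambda = exp(seq(-2, 4, length.out = 30)), nfolds = 10,
                alpha = 1, standardize = TRUE, type.measure = "mse")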

cv$lambda.min

[1] 3.014543

##--- Scaling data beforehand

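data <- na.omit(Hitters)
y <- data$Salary
x <- model.matrix(Salary ~ ., data = data)
x <- x[, -1]

# model.matrix returns a numeric matrix, so every column (dummy variables included) is scaled.
# Note that sd() uses the n - 1 denominator.
for (i in 1:ncol(x)) {
  if (is.numeric(x[, i])) {
    x[, i] <- (x[, i] - mean(x[, i])) / sd(x[, i])
  }
}

# Same lambda grid, but with glmnet's internal standardization switched off.
# The seed is not reset here, so the folds are drawn from a later point in the random stream.
cv <- cv.glmnet(x, y, lambda = exp(seq(-2, 4, length.out = 30)), alpha = 1, nfolds = 10,
                standardize = FALSE, type.measure = "mse")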

cv$lambda.min

[1] 2.451136
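
A like-for-like version of this comparison might look like the sketch below. It is not a definitive explanation and rests on assumptions that are not in the code above: that the two calls use different folds (cv.glmnet draws folds at random unless foldid is supplied, and the seed is set only once), and that glmnet's internal standardization divides by n rather than by n - 1 as sd() does. The names pop_sd, lambda_grid, cv_internal and cv_manual are hypothetical, introduced here only for illustration.

```r
library(glmnet)
library(ISLR2)

set.seed(34064064)

hitters <- na.omit(Hitters)
x <- model.matrix(Salary ~ ., hitters)[, -1]
y <- hitters$Salary
lambda_grid <- exp(seq(-2, 4, length.out = 30))

# Assumption: fixing the fold assignment once means both fits are
# scored on exactly the same train/validation splits.
foldid <- sample(rep(seq_len(10), length.out = nrow(x)))

# Hypothetical helper: standard deviation with the 1/n denominator
# (sd() divides by n - 1 instead).
pop_sd <- function(v) sqrt(mean((v - mean(v))^2))

# Center every column and divide by its 1/n standard deviation.
x_scaled <- scale(x, center = TRUE, scale = apply(x, 2, pop_sd))

# Fit 1: raw predictors, glmnet standardizes internally.
cv_internal <- cv.glmnet(x, y, lambda = lambda_grid, foldid = foldid,
                         alpha = 1, standardize = TRUE, type.measure = "mse")

# Fit 2: pre-scaled predictors, internal standardization switched off.
cv_manual <- cv.glmnet(x_scaled, y, lambda = lambda_grid, foldid = foldid,
                       alpha = 1, standardize = FALSE, type.measure = "mse")

# Compare the selected penalties and the minimum cross-validated MSE.
c(internal = cv_internal$lambda.min, manual = cv_manual$lambda.min)
c(internal = min(cv_internal$cvm), manual = min(cv_manual$cvm))
```

Even with the folds and the denominator aligned, the two fits need not agree exactly: with standardize = TRUE, each fold's model is standardized using only that fold's training rows, whereas the pre-scaled matrix uses the mean and standard deviation of the full data. It is also worth noting that the two values reported above, 3.014543 and 2.451136, are adjacent points on the 30-value lambda grid, so the original discrepancy is a single grid step.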
