I've been trying to use the cv.glmnet function from the glmnet package in R to fit a cross-validated LASSO regression. My explanatory variables are all on different scales, so I want to standardize them. I first used the argument standardize = TRUE
and then cross-checked the result by scaling my data beforehand and passing standardize = FALSE. However, the two methods give me different lambda and MSE results. Please see below.
set.seed(34064064)
library(glmnet)
library(ISLR2)
Hitters <- na.omit(Hitters)
x <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
cv <- cv.glmnet(x, y, lambda = exp(seq(-2, 4, length.out = 30)), nfolds = 10, alpha = 1, standardize = TRUE, type.measure = "mse")
cv$lambda.min
[1] 3.014543
##--- Scaling data beforehand
data <- na.omit(Hitters)
y <- data$Salary
x <- model.matrix(Salary ~ ., data = data)
x <- x[, -1]
for (i in 1:ncol(x)) {
  if (is.numeric(x[, i])) {
    x[, i] <- (x[, i] - mean(x[, i])) / sd(x[, i])
  }
}
cv <- cv.glmnet(x, y, lambda = exp(seq(-2, 4, length.out = 30)), alpha = 1, nfolds = 10, standardize = FALSE, type.measure = "mse")
cv$lambda.min
[1] 2.451136
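For reference, here is a minimal sketch of a more controlled version of the same comparison. Two things differ between the runs above besides the standardization itself: the seed is set only once, so the two cv.glmnet calls draw different random folds, and sd() divides by n - 1 whereas the glmnet documentation describes its internal standardization as using the 1/n formula. The sketch fixes the folds through the foldid argument and scales with a 1/n standard deviation. The names lambda_grid, foldid, pop_sd, x_scaled, cv_internal and cv_manual are illustrative and not part of the code above. The two lambda.min values may still not agree exactly, since scaling beforehand uses statistics from the full data while each cross-validation fit works on a training subset.

```r
library(glmnet)
library(ISLR2)

set.seed(34064064)
dat <- na.omit(Hitters)
x <- model.matrix(Salary ~ ., data = dat)[, -1]
y <- dat$Salary

lambda_grid <- exp(seq(-2, 4, length.out = 30))

# Draw the fold assignment once so both fits are evaluated on identical splits
foldid <- sample(rep(1:10, length.out = nrow(x)))

# Fit 1: let glmnet standardize the predictors internally
cv_internal <- cv.glmnet(x, y, lambda = lambda_grid, foldid = foldid,
                         alpha = 1, standardize = TRUE, type.measure = "mse")

# Fit 2: scale beforehand with a 1/n standard deviation (sd() would use n - 1)
pop_sd <- function(v) sqrt(mean((v - mean(v))^2))
x_scaled <- scale(x, center = TRUE, scale = apply(x, 2, pop_sd))
cv_manual <- cv.glmnet(x_scaled, y, lambda = lambda_grid, foldid = foldid,
                       alpha = 1, standardize = FALSE, type.measure = "mse")

# Compare the selected penalties and the minimum cross-validated MSE
c(internal = cv_internal$lambda.min, manual = cv_manual$lambda.min)
c(internal = min(cv_internal$cvm), manual = min(cv_manual$cvm))
```

If the gap between the two results shrinks noticeably with shared folds and the 1/n scaling, that would suggest the original difference came mostly from those two factors rather than from the standardize argument.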