nickolakis

Reputation: 621

Running multiple LASSO regressions simultaneously in R

I'm trying to run multiple LASSO regressions in R. To calculate the coefficients for a single model, I use the following code:

library(glmnet)
A <- as.matrix(data)
fit_lasso <- glmnet(A[,-1] , A[,1] , standardize = TRUE , alpha = 0.9) #LASSO model
print(fit_lasso) #LASSO model for different lambdas

cvfit <- cv.glmnet( A[,-1] , A[,1] , standardize = TRUE , type.measure = "mse" , nfolds = 5 , alpha = 0.9) 
cvfit    
cvfit$lambda.min
coef(cvfit , s = "lambda.min") 

which produces (among other output) the following:

> coef(cvfit , s = "lambda.min")
15 x 1 sparse Matrix of class "dgCMatrix"
                        1
(Intercept) -4.455556e+02
X2           .           
X3           2.869015e-05
X4           2.325949e-10
X5           .           
X6           5.955569e+00
X7           .           
X8           1.043362e+01
X9           .           
X10          3.313007e-01
X11          .           
X12          .           
X13          .           
X14          2.129794e-01
X15          .     

In the glmnet(A[,-1] , A[,1] , ...) call, A[,-1] represents all the explanatory X variables and A[,1] the response Y variable. I want to create a loop that calculates and displays the same results as above with each column in turn as the response (e.g. the first column as the response variable and all the others as the explanatory ones, then the second column as the response variable and all the others as the explanatory ones, and so on). Using a for statement I came up with the following, which doesn't seem to work. Can someone help me figure it out?

library(readxl)
data <-read_excel("example.xlsx")
data


library(glmnet)
A <- as.matrix(data)
for(i in 1:ncol(data)) fit_lasso[i] <- glmnet(A[,-i] , A[,i] , standardize = TRUE , alpha = 0.9)


for(i in 1:ncol(data)) cvfit[i] <- cv.glmnet( A[,-i] , A[,i] , standardize = TRUE , type.measure = "mse" , nfolds = 5 , alpha = 0.9) 


cvfit$lambda.min

coef(cvfit[i] , s = "lambda.min") 

Upvotes: 1

Views: 627

Answers (1)

ekoam

Reputation: 8844

Try this instead:

results <- lapply(seq_len(ncol(A)), function(i) {
  list(
    fit_lasso = glmnet(A[, -i], A[, i], standardize = TRUE, alpha = 0.9),
    cvfit = cv.glmnet(A[, -i] , A[, i] , standardize = TRUE , type.measure = "mse" , nfolds = 5 , alpha = 0.9)
  )
})

To get one set of results:

# Must use "[[" and "]]" here. 
results[[3L]]$cvfit$lambda.min
coef(results[[3L]]$cvfit, s = "lambda.min") 

Output

> results[[3L]]$cvfit$lambda.min
[1] 1.542775
> coef(results[[3L]]$cvfit, s = "lambda.min") 
11 x 1 sparse Matrix of class "dgCMatrix"
                      1
(Intercept)  52.7322579
mpg           .        
cyl          15.1087471
hp            0.5848973
drat          .        
wt           72.9452152
qsec         -9.1803140
vs          -11.6195183
am            .        
gear          .        
carb        -23.8347410
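
If you would rather keep a for loop, the likely problem with the attempt in the question is the single-bracket assignment: fit_lasso[i] <- glmnet(...) either fails because the target object does not exist yet, or tries to squeeze a whole fitted model (itself a list) into one slot. A minimal sketch of a loop that should work is below. It assumes mtcars as a stand-in for the data in example.xlsx; the names cvfits, lambda_min and coefs are only illustrative, and set.seed is added because cv.glmnet picks its folds at random.

```r
library(glmnet)

# Assumption: mtcars stands in for the data read from example.xlsx
A <- as.matrix(mtcars)

# Pre-allocate a plain list, one slot per response column
cvfits <- vector("list", ncol(A))
names(cvfits) <- colnames(A)

set.seed(1)  # cv.glmnet assigns observations to folds at random
for (i in seq_len(ncol(A))) {
  # "[[" stores the whole cv.glmnet object in slot i
  cvfits[[i]] <- cv.glmnet(A[, -i], A[, i], standardize = TRUE,
                           type.measure = "mse", nfolds = 5, alpha = 0.9)
}

# One lambda.min per response column, as a named numeric vector
lambda_min <- vapply(cvfits, function(cv) cv$lambda.min, numeric(1))

# Coefficients at lambda.min for each model, as named numeric vectors
coefs <- lapply(cvfits, function(cv) as.matrix(coef(cv, s = "lambda.min"))[, 1])

lambda_min
coefs[["disp"]]
```

Because the list is named after the columns, a model can be looked up either by position (cvfits[[3]]) or by response name (cvfits[["disp"]]). The exact numbers will vary with the seed and the fold assignment.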

Upvotes: 1
