Reputation: 174
I am trying to fit a lasso regression with a cross-validated lambda using the glmnet and caret packages. My code is:
library(glmnet)
library(doParallel)

dim(x)
# 121755 465
dim(y)
# 121755 1

### cv.glmnet
set.seed(2108)
cl <- makePSOCKcluster(detectCores() - 2, outfile = "")
registerDoParallel(cl)
system.time(
  las.glm <- cv.glmnet(x = x, y = y, alpha = 1, type.measure = "mse",
                       parallel = TRUE, nfolds = 5,
                       lambda = seq(0.001, 0.1, by = 0.001),
                       standardize = FALSE)
)
stopCluster(cl)
# user system elapsed
# 17.98 2.28 37.23
library(caret)

### caret
caretctrl <- trainControl(method = "cv", number = 5)
tune <- expand.grid(alpha = 1, lambda = seq(0.001, 0.1, by = 0.001))
set.seed(2108)
cl <- makePSOCKcluster(detectCores() - 2, outfile = "")
registerDoParallel(cl)
system.time(
  las.car <- train(x = x, y = as.numeric(y), alpha = 1, method = "glmnet",
                   metric = "RMSE", allowParallel = TRUE,
                   trControl = caretctrl, tuneGrid = tune)
)
stopCluster(cl)
# error
Something is wrong; all the RMSE metric values are missing:
RMSE Rsquared MAE
Min. : NA Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA Median : NA
Mean :NaN Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA Max. : NA
NA's :100 NA's :100 NA's :100
Error: Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
There were missing values in resampled performance measures.
Timing stopped at: 3.97 1.37 127.9
I understand that this can happen when one of the resamples does not contain enough data, but I doubt that is an issue given my sample size and only 5 folds. I have tried the following suggested solutions, which didn't work for me:

- setting allowParallel when the CPU is not multithreaded

I reckon that caret is performing some additional resampling that glmnet is not, leading to the error. Can someone shed any light on this problem?
Edit 1
x is a semi-sparse matrix of 210 indicator and 255 continuous variables.
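For reference, a matrix with that shape can be simulated as follows (the row count, sparsity level, and variable names here are illustrative, not my actual data):

```r
library(Matrix)
set.seed(2108)

n <- 1000  # illustrative; the real data has 121755 rows
x_ind  <- Matrix(rbinom(n * 210, 1, 0.05), nrow = n, sparse = TRUE)  # 210 indicator columns
x_cont <- Matrix(rnorm(n * 255), nrow = n, sparse = TRUE)            # 255 continuous columns
x <- cbind(x_ind, x_cont)  # 465-column semi-sparse design matrix
```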
Upvotes: 1
Views: 910
Reputation: 46978
I think most of the problem comes from passing alpha=1 to train again, on top of specifying it in the tuning grid. Here is a reproducible example; it works even if your x and y are sparse:
library(glmnet)
library(caret)
library(Matrix)

# a sparse version of the data also works:
dat <- Matrix(as.matrix(mtcars), sparse = TRUE)

x <- as.matrix(mtcars[, -1])
y <- as.matrix(mtcars[, 1])
L <- seq(0.001, 0.1, by = 0.02)

las.glm <- cv.glmnet(x = x, y = y, alpha = 1, type.measure = "mse",
                     nfolds = 5, lambda = L, standardize = FALSE)
So cv.glmnet works. Now, if we try your code, it reproduces the error:
caretctrl <- trainControl(method = "cv", number = 5)
tune <- expand.grid(alpha = 1, lambda = L)

las.car <- train(x = x, y = as.numeric(y), alpha = 1, method = "glmnet",
                 metric = "RMSE", trControl = caretctrl, tuneGrid = tune)
Something is wrong; all the RMSE metric values are missing:
RMSE Rsquared MAE
Min. : NA Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
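My understanding of why this happens (a sketch of the mechanism, not caret's actual internals): train forwards extra arguments through `...` to the underlying glmnet call, while the tuning grid supplies alpha as well, so inside each fold glmnet receives alpha twice and fails, leaving every resampled metric NA. The duplicate-argument failure can be mimicked with any R function:

```r
# illustrative only: calling a function with the same named argument twice
g <- function(x, y, alpha) NULL
# g(1, 2, alpha = 1, alpha = 1)
# Error: formal argument "alpha" matched by multiple actual arguments
```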
Remove the alpha argument from the train call, keeping it only in tuneGrid:
las.car <- train(x = x, y = as.numeric(y), method = "glmnet",
                 metric = "RMSE", trControl = caretctrl, tuneGrid = tune)
glmnet
32 samples
10 predictors
No pre-processing
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 25, 26, 26, 26, 25
Resampling results across tuning parameters:
lambda RMSE Rsquared MAE
0.001 3.798431 0.7689346 3.003005
0.021 3.360426 0.7821630 2.714694
0.041 3.099981 0.7958414 2.543577
0.061 2.842374 0.8066351 2.328833
0.081 2.801421 0.8046289 2.301098
And it will also work with a dense matrix.
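As a quick sanity check (assuming the las.glm and las.car objects from above are still in the session), you can compare the lambda each approach selects; the two values may differ slightly because the fold assignments differ:

```r
las.glm$lambda.min        # lambda minimizing CV MSE in cv.glmnet
las.car$bestTune$lambda   # lambda selected by caret's resampling
```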
Upvotes: 1