Reputation: 1
I am trying to run xgb.cv for a multiclass classification problem in R (via RStudio) and I keep getting the following error message:
• Error in slice.xgb.DMatrix(dall, unlist(folds[-k])) : std::bad_alloc
My data set includes a 4-category response variable and 12 explanatory variables. I have converted the response variable to a numeric data type, split my data into train and test groups (80/20), created a sparse matrix using one-hot encoding, and then built my xgb.DMatrix using the following code:
TrainM <- sparse.model.matrix(Response_var ~ .-1, data = Train_data)
Train_Label <- Train_data[,1]
Train_Matrix <- xgb.DMatrix(data = as.matrix(TrainM), label = Train_Label)
TestM <- sparse.model.matrix(Response_var ~ .-1, data = Test_data)
Test_Label <- Test_data[,1]
Test_Matrix <- xgb.DMatrix(data = as.matrix(TestM), label = Test_Label)
I then set the model parameters as follows:
nc <- length(unique(Train_Label))
xgb_params <- list(objective = "multi:softprob",
                   eta = 0.01,
                   gamma = 2,
                   eval_metric = 'AUC',
                   max_depth = 15,
                   subsample = 0.5,
                   colsample_bytree = 0.5,
                   num_class = nc,
                   min_child_weight = 2)
And then run the cross-validated model as:
CV_Model <- xgb.cv(params = xgb_params,
                   data = Train_Matrix,
                   nrounds = 1000,
                   nfold = 10,
                   stratified = TRUE,
                   print_every_n = 1,
                   early_stopping_rounds = 15,
                   maximize = FALSE,
                   prediction = TRUE)
Everything runs fine until I kick off the CV model, which errors out very quickly (just as the model is initializing):
Error in slice.xgb.DMatrix(dall, unlist(folds[-k])) : std::bad_alloc
I am running this on a Windows 10 workstation using R v4.1.1 and RStudio v1.4.1106. I should note that I have been running the "same" code for several weeks now with no issue, the only difference being that the evaluation metric was 'mlogloss' instead of 'AUC'. As soon as I switched to 'AUC', the issue began to occur.
Any help resolving this would be very much appreciated!
Upvotes: 0
Views: 726
Reputation: 978
Hi Doug Turk, and welcome to the site. This error most likely comes from running out of memory: std::bad_alloc is the exception C++ throws when a memory allocation fails.
See for example here: https://en.cppreference.com/w/cpp/memory/new/bad_alloc
On Windows you can verify this by opening the Task Manager while the code runs; you should see memory usage climb towards 100%. Try rerunning the code on a subset of your data, to reduce the memory requirements, and see whether that fixes the problem.
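As a rough sketch of the subsetting idea, something like the following could be tried. The data frame here is a made-up stand-in for your Train_data, and the 25% fraction, the column names x1 and x2, and the small nrounds/nfold values are arbitrary assumptions chosen to keep the run cheap. It also passes the sparse matrix straight to xgb.DMatrix rather than going through as.matrix(), which avoids building a dense copy; that may or may not matter for your data, so treat it as something to test rather than a confirmed fix.

```r
library(Matrix)
library(xgboost)

set.seed(42)

# Hypothetical stand-in for Train_data (response in the first column)
Train_data <- data.frame(
  Response_var = sample(0:3, 2000, replace = TRUE),
  x1 = rnorm(2000),
  x2 = factor(sample(letters[1:5], 2000, replace = TRUE))
)

# Draw a random 25% subset of the rows to cut memory use
idx <- sample(nrow(Train_data), size = floor(0.25 * nrow(Train_data)))
Train_sub <- Train_data[idx, ]

# One-hot encode the subset as a sparse matrix
TrainM_sub <- sparse.model.matrix(Response_var ~ . - 1, data = Train_sub)

# Pass the sparse matrix directly; as.matrix() would create a dense copy
Train_Matrix_sub <- xgb.DMatrix(data = TrainM_sub,
                                label = Train_sub$Response_var)

# Small, cheap CV run just to check whether the allocation error persists
params_small <- list(objective = "multi:softprob",
                     eval_metric = "mlogloss",
                     num_class = 4,
                     max_depth = 3,
                     eta = 0.1)

CV_small <- xgb.cv(params = params_small,
                   data = Train_Matrix_sub,
                   nrounds = 5,
                   nfold = 3,
                   verbose = FALSE)
```

If this runs cleanly, increase the subset fraction step by step and swap in your own parameters while watching memory in the Task Manager, to see at which point the allocation fails.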
Good luck.
Upvotes: 2