Klausos Klausos

Reputation: 16050

GBM and Caret package: invalid number of intervals

Although I define target <- factor(train$target, levels = c(0, 1)), the code below produces this error:

  Error in cut.default(y, unique(quantile(y, probs = seq(0, 1, length = cuts))), :
    invalid number of intervals
  In addition: Warning messages:
  1: In train.default(x, y, weights = w, ...) :
    cannnot compute class probabilities for regression

What does it mean, and how can I fix it?

  gbmGrid <- expand.grid(n.trees = (1:30)*10, 
                         interaction.depth = c(1, 5, 9), 
                         shrinkage = 0.1)

  fitControl <- trainControl(method = "repeatedcv", 
                             number = 5, 
                             repeats = 5, 
                             verboseIter = FALSE, 
                             returnResamp = "all",
                             classProbs = TRUE)

  target <- factor(train$target, levels = c(0, 1)) 

  gbm <- caret::train(target ~ .,
                      data = train,
                      #distribution="gaussian",
                      method = "gbm",
                      trControl = fitControl,
                      tuneGrid = gbmGrid)

  prob = predict(gbm, newdata=testing, type='prob')[,2]

Upvotes: 0

Views: 4827

Answers (1)

topepo

Reputation: 14316

First, don't do this:

 target <- factor(train$target, levels = c(0, 1)) 

You will get a warning:

At least one of the class levels are not valid R variables names; This may cause errors if class probabilities are generated because the variables names will be converted to: X0, X1
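The renaming mentioned in that warning matches what base R's make.names does to labels that are not syntactically valid names. The snippet below is only a small illustration of that behaviour, not caret's internal code; the vectors lv and ok are made up for the example.

```r
# Factor levels built from 0/1 are the strings "0" and "1".
lv <- c("0", "1")

# make.names() prefixes names that start with a digit, giving "X0" and "X1".
renamed <- make.names(lv)
print(renamed)

# A level set is "safe" when make.names() leaves it unchanged.
ok <- c("no", "yes")
print(identical(make.names(ok), ok))
```

Any labels that pass this check avoid the warning; "no" and "yes" are just one possible choice.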

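Second, you created a separate object called target, but train never sees it. With the formula method, train uses the column called target inside the data frame train, and that column is unchanged. Because it is still numeric, train treats the problem as regression, which is what the warning about class probabilities is telling you, and the cut.default error most likely comes from the same regression path. Modify the column itself.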
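A minimal sketch of "modify the column", assuming the outcome is coded 0/1 as in the question. The toy data frame, the predictor names x1 and x2, and the labels "no" and "yes" are assumptions made for illustration; use whatever valid names suit your data.

```r
# Toy stand-in for the asker's data frame (hypothetical columns x1, x2).
set.seed(1)
train <- data.frame(target = rbinom(20, 1, 0.5),
                    x1 = rnorm(20),
                    x2 = rnorm(20))

# Overwrite the column inside the data frame, not a separate object,
# and give the levels labels that are valid R names.
train$target <- factor(train$target,
                       levels = c(0, 1),
                       labels = c("no", "yes"))

print(is.factor(train$target))
print(levels(train$target))
```

With the column recoded this way, the train call from the question (target ~ ., data = train) should see a factor outcome and fit a classification model. The class-probability columns are then named after the levels, so the positive class can be picked by name, e.g. predict(gbm, newdata = testing, type = "prob")[, "yes"], rather than by position.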

Upvotes: 1
