Andre Calais Salles

Reputation: 35

Non-tree model error from varImp when using the xgbTree method in caret with weights on the target variable

When I create a model with the 'train' function from the caret package to do gradient boosting with weights, I get an error when using the 'varImp' function saying it didn't detect a tree model. But when I remove the weights it works.

The code below produces the error:

library(caret)

set.seed(123)

# inverse class-frequency weights: each class contributes half of the total weight
model_weights <- ifelse(modelo_df_sseg$FATALIDADES == 1,
                        yes = (1/table(modelo_df_sseg$FATALIDADES)[2]) * 0.5,
                        no = (1/table(modelo_df_sseg$FATALIDADES)[1]) * 0.5
                        )

model <- train(
  as.factor(FATALIDADES) ~.,
  data = modelo_df_sseg, 
  method = "xgbTree",
  trControl = trainControl("cv", number = 10),
  weights = model_weights
  )

varImp(model)

But if I don't apply weights it works.

Why doesn't varImp recognize my tree model?

EDIT 04-SEP-2020

It was suggested by missuse in the comments section to use wts instead of weights. Now I get the error below:

Error in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, : formal argument 'wts' matched by multiple actual arguments

I made a small example with the Arrests dataset from the car package so you can test it yourself:

library(caret)
library(car)   # for the Arrests dataset

set.seed(123)

basex <- Arrests

model_weights <- ifelse(basex$released == 2,
                        yes = (1/table(basex$released)[2]) * 0.5,
                        no = (1/table(basex$released)[1]) * 0.5
                        )

y = basex$released
x = basex
tc = trainControl("cv", number = 10)

mtd = "xgbTree"
model <- train(
  x, 
  y, 
  method = mtd,
  trControl = tc, 
  wts = model_weights,
  verbose = TRUE
  )

Maybe I'm creating the weights vector wrong. But I can't find any documentation on the 'wts' parameter.

Upvotes: 2

Views: 798

Answers (1)

missuse

Reputation: 19756

The example code has several problems.

The correct way to apply weights in caret is using the weights argument to train.
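
In short, something along these lines (just a sketch with placeholder objects x, y and w for the predictors, the outcome, and the per-row case weights; the full worked example follows below):

library(caret)

# w is a numeric vector of case weights, one value per row of x
fit <- train(x = x,
             y = y,
             method = "xgbTree",
             weights = w,
             trControl = trainControl(method = "cv", number = 10))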

I was mistaken in the comments where I recommended using the argument wts. My error came from the xgbTree source, specifically these lines:

if (!is.null(wts))
    xgboost::setinfo(x, 'weight', wts)

which suggested to me that wts might be the correct argument name.
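
For reference, a quick way to look at that source yourself (a small sketch, assuming a current caret installation) is caret's getModelInfo:

library(caret)

xgb_info <- getModelInfo("xgbTree", regex = FALSE)[[1]]
xgb_info$fit     # the fit function quoted above; wts is only its internal argument name
names(xgb_info)  # the other pieces of the model definition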

Let's go through the example and fix all the problems.

library(caret)
library(car) #for the data set
library(tidyverse) #because I like to use it

data(Arrests)
basex <- Arrests


table(basex$released) #released is the outcome class

  No  Yes 
 892 4334 

Here we see the "Yes" outcome is much more frequent than the "No" outcome. This will skew the predicted probabilities and favor a model that tends to predict "Yes". One way to fix it is to give a higher weight to the "No" observations. A meaningful weight for the "No" observations is the proportion of the "Yes" class, and a meaningful weight for the "Yes" observations is the proportion of the "No" class:

model_weights <- ifelse(basex$released == "Yes",
                        table(basex$released)[1]/nrow(basex),
                        table(basex$released)[2]/nrow(basex))

The two distinct weight values sum to 1, and each class receives the same total weight (see the quick check after the preview below):

head(data.frame(basex,
                weights = model_weights))
  released colour year age    sex employed citizen checks  weights
1      Yes  White 2002  21   Male      Yes     Yes      3 0.170685
2       No  Black 1999  17   Male      Yes     Yes      3 0.829315
3      Yes  White 2000  24   Male      Yes     Yes      3 0.170685
4       No  Black 2000  46   Male      Yes     Yes      1 0.829315
5      Yes  Black 1999  27 Female      Yes     Yes      1 0.170685
6      Yes  Black 1998  16 Female      Yes     Yes      0 0.170685

"Yes" is more frequent so we give it a lesser weight.

From the head() preview above we can see that the data frame has several categorical predictors (colour, sex, ...). xgbTree cannot handle them, so you will need to convert them to numeric prior to modeling. One way to convert categorical predictors to numeric is dummy coding. There are other ways, but those are not within the scope of this answer.

To use dummy coding:

dummies <- dummyVars(released ~ ., data = basex)
x <- predict(dummies, newdata = basex)
head(x)
colour.Black colour.White year age sex.Female sex.Male employed.No employed.Yes citizen.No citizen.Yes checks
1            0            1 2002  21          0        1           0            1          0           1      3
2            1            0 1999  17          0        1           0            1          0           1      3
3            0            1 2000  24          0        1           0            1          0           1      3
4            1            0 2000  46          0        1           0            1          0           1      1
5            1            0 1999  27          1        0           0            1          0           1      1
6            1            0 1998  16          1        0           0            1          0           1      0

y <- basex$released

Now we have our weights, x, and y.

Since I will fit several models below, I will first create the resampling folds and reuse them within each call to train so they don't differ:

folds <- createFolds(basex$released, 10)

Since there is an imbalance in the class frequencies, I will use twoClassSummary so we can see the sensitivity and specificity of the trained models:

tc <- trainControl(method = "cv",
                   number = 10,
                   summaryFunction = twoClassSummary,
                   index = folds, #predefined folds
                   classProbs = TRUE) #needed for twoClassSummary

mtd <- "xgbTree"

model <- train(x = x, 
               y = y, 
               method = mtd,
               trControl = tc, 
               weights = model_weights,
               verbose = TRUE,
               metric = "ROC")

#no errors

model$results %>%
  filter(ROC == max(ROC))
  eta max_depth gamma colsample_bytree min_child_weight subsample nrounds       ROC      Sens     Spec       ROCSD     SensSD     SpecSD
1 0.3         1     0              0.8                1         1      50 0.7031076 0.6185944 0.693945 0.009074758 0.03516597 0.01536701

Here we see that with the model weights, the model with the highest AUC has a sensitivity of 0.6185944 and a specificity of 0.693945.

Without the weights:

model2 <- train(x = x,
                y = y,
                method = mtd,
                trControl = tc,
                verbose = TRUE,
                metric = "ROC")

#no errors

model2$results %>%
  filter(ROC == max(ROC))
  eta max_depth gamma colsample_bytree min_child_weight subsample nrounds      ROC      Sens      Spec     ROCSD     SensSD     SpecSD
1 0.3         1     0              0.8                1      0.75      50 0.701109 0.1000325 0.9713885 0.0101395 0.03343579 0.01236701

The model without the weights has a sensitivity of 0.1000325 and a specificity of 0.9713885.

So passing meaningful weights through the weights argument fixed the model's tendency to predict "Yes" all the time.
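
And, coming back to the original question, with the weights supplied through train's weights argument, varImp should now recognize the fitted xgbTree model (a quick check on the weighted model from above):

varImp(model)   # variable importance for the weighted xgbTree model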

Upvotes: 6
