zesla

Reputation: 11793

How does caret calculate sensitivity and specificity in resamples?

Recently, when using the caret package to fit a model, I found that the sensitivity and specificity reported in the resample of its train object differ from those calculated manually for each fold. Let me use the GermanCredit data as an example.

library(caret)
library(dplyr)   # needed below for %>% and filter()
data("GermanCredit")
form = as.formula('credit_risk~amount+savings+installment_rate+age+housing+number_credits')
train.control <- trainControl(method="cv", 
                              number=5,
                              summaryFunction = twoClassSummary,
                              classProbs = TRUE,
                              savePredictions='all')
rf = train(form, data=GermanCredit,  method = 'rf',
           metric = 'ROC', trControl=train.control)

print(rf$resample)

We get:

ROC         Sens        Spec        Resample
0.6239881   0.9428571   0.13333333  Fold1   
0.6603571   0.9714286   0.08333333  Fold2   
0.6622619   0.9642857   0.06666667  Fold5   
0.6502381   0.9928571   0.10000000  Fold4   
0.7072619   0.9714286   0.16666667  Fold3

As you can see, for fold 1, sensitivity and specificity are 0.94 and 0.13 respectively.

Now, if we take the resampled predictions from Fold1 and use confusionMatrix to calculate the metrics, we get the result below:

resamp.1 = rf$pred %>% filter(Resample=='Fold1')
cm = confusionMatrix(resamp.1$pred, resamp.1$obs)
print(cm) 

Confusion Matrix and Statistics

          Reference
Prediction good bad
      good  366 135
      bad    54  45

               Accuracy : 0.685          
                 95% CI : (0.6462, 0.722)
    No Information Rate : 0.7            
    P-Value [Acc > NIR] : 0.8018         

                  Kappa : 0.1393         
 Mcnemar's Test P-Value : 5.915e-09      

            Sensitivity : 0.8714         
            Specificity : 0.2500         
         Pos Pred Value : 0.7305         
         Neg Pred Value : 0.4545         
             Prevalence : 0.7000         
         Detection Rate : 0.6100         
   Detection Prevalence : 0.8350         
      Balanced Accuracy : 0.5607         

       'Positive' Class : good

As you can see, sensitivity and specificity are 0.87 and 0.25 respectively. Compared with the values reported directly in rf$resample, the numbers are completely different! The same thing happens with the other folds.

Did I do something wrong? Or is caret doing something differently? Thanks.

Upvotes: 2

Views: 2863

Answers (1)

nadizan

Reputation: 1373

Please be aware that data(GermanCredit) does not contain the same variables as the ones you use in form; for future questions, it would help if you posted a reproducible example. It would also help to use set.seed().

Nevertheless, the issue here is that you need to take into account mtry, i.e. the number of "Randomly Selected Predictors" used in the random forest model: rf$pred contains the held-out predictions for every candidate mtry, not just the final one. See the documentation and code here.

I adjusted the GermanCredit so that everyone can run it as is:

library(caret)
library(dplyr)   # needed below for %>% and filter()
data("GermanCredit")
form = as.formula('Class ~ Amount + SavingsAccountBonds.lt.100 + SavingsAccountBonds.100.to.500 +
                   SavingsAccountBonds.500.to.1000 + SavingsAccountBonds.gt.1000 + SavingsAccountBonds.Unknown +
                   InstallmentRatePercentage + Age + Housing.ForFree + Housing.Own + Housing.Rent + NumberExistingCredits')
train.control <- trainControl(method="cv", 
                              number=5,
                              summaryFunction = twoClassSummary,
                              classProbs = TRUE,
                              savePredictions='all')

set.seed(100)
rf <- train(form, data=GermanCredit,  method = 'rf',
           metric = 'ROC', trControl=train.control)
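
A quick way to see this in the fitted object (a minimal sketch using the rf object just fitted):

# rf$pred holds one block of held-out predictions per candidate mtry
table(rf$pred$mtry)   # prediction counts per tuning value
rf$bestTune           # the mtry value that rf$resample refers to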

If we print rf, we can see that the final value of mtry used in the model was 2.

> rf
Random Forest 

1000 samples
  12 predictor
   2 classes: 'Bad', 'Good' 

No pre-processing
Resampling: Cross-Validated (5 fold) 
Summary of sample sizes: 800, 800, 800, 800, 800 
Resampling results across tuning parameters:

  mtry  ROC        Sens        Spec     
   2    0.6465714  0.06333333  0.9842857
   7    0.6413214  0.31333333  0.8571429
  12    0.6358214  0.31666667  0.8385714

ROC was used to select the optimal model using the largest value.
The final value used for the model was mtry = 2.
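
The per-fold values in rf$resample belong to this winning row; as a small sanity check (a sketch assuming the rf object above), averaging them across folds should reproduce the mtry = 2 line of the table:

# mean of the per-fold metrics reproduces the mtry = 2 row above
colMeans(rf$resample[, c("ROC", "Sens", "Spec")])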

Therefore, by also filtering on mtry == 2 in rf$pred, you get the expected result:

resamp.1 <- rf$pred %>% filter(Resample=='Fold1' & mtry == 2)
cm <- confusionMatrix(resamp.1$pred, resamp.1$obs)
print(cm)

Confusion Matrix and Statistics

          Reference
Prediction Bad Good
      Bad    7    5
      Good  53  135

               Accuracy : 0.71            
                 95% CI : (0.6418, 0.7718)
    No Information Rate : 0.7             
    P-Value [Acc > NIR] : 0.4123          

                  Kappa : 0.1049          
 Mcnemar's Test P-Value : 6.769e-10       

            Sensitivity : 0.1167          
            Specificity : 0.9643          
         Pos Pred Value : 0.5833          
         Neg Pred Value : 0.7181          
             Prevalence : 0.3000          
         Detection Rate : 0.0350          
   Detection Prevalence : 0.0600          
      Balanced Accuracy : 0.5405          

       'Positive' Class : Bad  

cm$byClass[1:2] == rf$resample[1, 2:3]
 Sens Spec
 TRUE TRUE
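
For completeness, the same numbers fall straight out of the cell counts in the matrix above, with 'Bad' as the positive class:

# manual check from the Fold1 confusion matrix ('Bad' is the positive class)
7 / (7 + 53)      # Sensitivity = true Bad / all actual Bad   -> 0.1167
135 / (135 + 5)   # Specificity = true Good / all actual Good -> 0.9643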

EDIT:

You can also verify this by checking rf$resampledCM, which shows the per-fold confusion-matrix cell counts for each mtry value and fold.
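
For example (a minimal sketch using the rf object above; caret stores one row per mtry/fold combination, with the cell counts in columns named cell1, cell2, and so on):

# one row of confusion-matrix cell counts per (mtry, fold) combination;
# filter down to the winning mtry to match rf$resample
subset(rf$resampledCM, mtry == 2)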

Upvotes: 2
