Dobbleri

Reputation: 123

Discrepancy between caret, yardstick, and MLeval in R regarding precision-recall AUC

I am trying to plot the precision-recall curve of a cross-validated caret train object and measure the area under it. Printing the object reports the area under the precision-recall curve (the AUC column) for each tuning combination:

> rf

Random Forest 

807 samples
 11 predictor
  2 classes: 'X0', 'X1' 

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 727, 726, 727, 726, 727, 726, ... 
Resampling results across tuning parameters:

  mtry  splitrule   AUC        Precision  Recall     F        
   2    gini        0.8179379  0.8618888  0.6713675  0.7494214
   2    extratrees  0.8061601  0.8960233  0.5725071  0.6901257
   7    gini        0.7798593  0.8775955  0.8037037  0.8360293
   7    extratrees  0.8004585  0.8587664  0.7696581  0.8090205
  12    gini        0.7659204  0.8578710  0.8229345  0.8364962
  12    extratrees  0.7840497  0.8498209  0.7925926  0.8167108

Tuning parameter 'min.node.size' was held constant at a value of 1
AUC was used to select the optimal model using the largest value.
The final values used for the model were mtry = 2, splitrule = gini and min.node.size = 1.

However, when I plot the actual curve with yardstick, I get quite different results.

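library(yardstick)
library(ggplot2)

# PR curve from the predictions stored in the train object; X0 is the event class
prRf <- pr_curve(rf$pred, X0, truth = obs)

ggplot() +
  geom_path(aes(x = recall, y = precision), colour = "blue", linetype = 1, data = prRf) +
  xlab("Recall") +
  ylab("Precision") +
  theme_minimal() +
  ylim(0, 1)

# Area under the PR curve, computed on the same pooled predictions
pr_auc(rf$pred, X0, truth = obs)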

Here the curve looks much "better", and the area under it is higher than the value caret reports internally (0.878 vs. 0.817). A plain run of MLeval gives similarly "better" results:

library(MLeval)

evalm(rf)

All of this confuses me quite a bit. I suspect I may somehow be evaluating in-sample, but I am not sure how to do this correctly without splitting the data beforehand.
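One possible sanity check is to compare like with like. The AUC that caret prints is an average of per-fold values, whereas `pr_auc(rf$pred, ...)` pools every row of `rf$pred` into a single curve. If the model was trained with `savePredictions = TRUE`, those rows also cover every tuning combination, not only the selected one. The sketch below is only an illustration under assumptions: `pred` is a simulated, hypothetical stand-in for `rf$pred` (same column names, one tuning parameter only), `best` stands in for `rf$bestTune`, and dplyr is assumed to be installed alongside yardstick. It keeps only the rows for the selected tuning values, then computes the PR AUC per fold and averages it, next to the pooled value.

```r
library(dplyr)
library(yardstick)

set.seed(1)

# Simulated stand-in for rf$pred (hypothetical data, NOT the real model output):
# one row per sample and tuning value, with the fold it was held out in.
n <- 200
truth <- factor(sample(c("X0", "X1"), n, replace = TRUE), levels = c("X0", "X1"))
pred <- expand.grid(rowIndex = seq_len(n), mtry = c(2, 7))
pred$obs <- truth[pred$rowIndex]
pred$Resample <- paste0("Fold", sprintf("%02d", (pred$rowIndex - 1) %% 10 + 1))
# Noisy class probability for X0, clipped to [0, 1]
p <- ifelse(pred$obs == "X0", 0.65, 0.35) + rnorm(nrow(pred), sd = 0.25)
pred$X0 <- pmin(pmax(p, 0), 1)

# Stand-in for rf$bestTune
best <- data.frame(mtry = 2)

# Keep only the held-out predictions made with the selected tuning values
best_pred <- inner_join(pred, best, by = "mtry")

# PR AUC within each fold, then averaged across folds
fold_auc <- best_pred %>%
  group_by(Resample) %>%
  pr_auc(obs, X0)
mean_fold_auc <- mean(fold_auc$.estimate)

# PR AUC on all held-out predictions pooled into one curve
pooled_auc <- pr_auc(best_pred, obs, X0)$.estimate

print(c(mean_of_folds = mean_fold_auc, pooled = pooled_auc))
```

With the real object, the filtering step would presumably become `inner_join(rf$pred, rf$bestTune, by = names(rf$bestTune))`. Even after filtering and averaging per fold, a small gap may remain, since the two packages may not integrate the curve in exactly the same way.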

Upvotes: 0

Views: 44

Answers (0)
