Nico Piace

Reputation: 21

Set cutoff threshold when predicting in R

I'm trying different methods on a binary classification problem.

I'm using predict for basically every one of them, and confusionMatrix from the caret package to assess the results, but I can't find a way to specify the best cutoff threshold (which I've already found by running roc and extracting the coords).

For example: I know my best cutoff is 0.77, but I can't find a way to use it in the predict function, which uses 0.5 by default.

Is there a way to do it?

Thank you

Upvotes: 1

Views: 8161

Answers (1)

s__

Reputation: 9525

Here is something you can try, if I've understood correctly:

# caret provides confusionMatrix()
library(caret)

# a model with a famous dataset
model <- glm(formula = vs ~ wt + disp, data = mtcars, family = binomial)

# predict on the same data: type = "response" returns probabilities;
# then apply the cutoff and convert to a factor, all in one line
pred_ <- as.factor(ifelse(predict(model, mtcars, type = "response") > 0.7, "1", "0"))

# here we go!
confusionMatrix(pred_, as.factor(mtcars$vs))

    Confusion Matrix and Statistics

          Reference
Prediction  0  1
         0 16  3
         1  2 11

               Accuracy : 0.8438          
                 95% CI : (0.6721, 0.9472)
    No Information Rate : 0.5625          
    P-Value [Acc > NIR] : 0.000738        

                  Kappa : 0.68            

 Mcnemar's Test P-Value : 1.000000        

            Sensitivity : 0.8889          
            Specificity : 0.7857          
         Pos Pred Value : 0.8421          
         Neg Pred Value : 0.8462          
             Prevalence : 0.5625          
         Detection Rate : 0.5000          
   Detection Prevalence : 0.5938          
      Balanced Accuracy : 0.8373          

       'Positive' Class : 0 
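
If you need to do this for several models or several cutoffs, the same idea can be wrapped in a small helper. This is only a sketch in base R: classify_at_cutoff is a made-up name, not a caret or stats function, and 0.77 is the cutoff from the question, not one estimated from mtcars. Fixing the factor levels explicitly keeps both classes present even when a high cutoff pushes every prediction into one class.

```r
# Hypothetical helper (not part of caret or base R): converts predicted
# probabilities into class labels using a chosen cutoff.
classify_at_cutoff <- function(prob, cutoff = 0.5, levels = c("0", "1")) {
  # probabilities strictly above the cutoff get the second level, the rest the first
  factor(ifelse(prob > cutoff, levels[2], levels[1]), levels = levels)
}

# same model as above
model <- glm(vs ~ wt + disp, data = mtcars, family = binomial)

# predicted probabilities of vs == 1
prob <- predict(model, newdata = mtcars, type = "response")

pred_default <- classify_at_cutoff(prob)        # the usual 0.5 cutoff
pred_custom  <- classify_at_cutoff(prob, 0.77)  # cutoff taken from the question

# plain cross-tabulation of predictions against the observed classes
reference <- factor(mtcars$vs, levels = c("0", "1"))
table(Prediction = pred_custom, Reference = reference)
```

The resulting factor can be passed to confusionMatrix exactly as before. Note that the output above reports 'Positive' Class : 0; if class 1 is the one you care about, confusionMatrix accepts a positive argument, e.g. confusionMatrix(pred_custom, reference, positive = "1").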

Upvotes: 5
