Reputation: 21
I'm trying different methods on a binary classification problem.
I'm using `predict` for basically every one of them, and `confusionMatrix` from the caret package to assess the results, but I can't find a way to specify the best cutoff threshold (which I've already found using `roc` and extracting the coords).
For example: I know my best cutoff is 0.77, but I can't find a way to use it in the predict function, which uses 0.5 by default.
Is there a way to do it?
Thank you
Upvotes: 1
Views: 8161
Reputation: 9525
Here is something you can try, if I've understood correctly:
library(caret)  # provides confusionMatrix()
# a model with a famous dataset
model <- glm(formula = vs ~ wt + disp, data = mtcars, family = binomial)
# predict on the same data: type = "response" returns probabilities;
# the cutoff is applied and the result converted to a factor, all in one line
pred_ <- as.factor(ifelse(predict(model, mtcars, type = "response") > 0.7, "1", "0"))
# here we go!
confusionMatrix(pred_, as.factor(mtcars$vs))
Confusion Matrix and Statistics

          Reference
Prediction  0  1
         0 16  3
         1  2 11

               Accuracy : 0.8438
                 95% CI : (0.6721, 0.9472)
    No Information Rate : 0.5625
    P-Value [Acc > NIR] : 0.000738

                  Kappa : 0.68

 Mcnemar's Test P-Value : 1.000000

            Sensitivity : 0.8889
            Specificity : 0.7857
         Pos Pred Value : 0.8421
         Neg Pred Value : 0.8462
             Prevalence : 0.5625
         Detection Rate : 0.5000
   Detection Prevalence : 0.5938
      Balanced Accuracy : 0.8373

       'Positive' Class : 0
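If you want to reuse the same cutoff across several models, the thresholding step can be wrapped in a small function. The following is only a minimal sketch: `apply_cutoff` is a hypothetical helper name (it is not part of caret), and 0.77 is simply the value mentioned in the question. It fixes the factor levels explicitly so both classes are always present, even if no observation crosses the cutoff.

```r
# Hypothetical helper (not part of caret): turn probabilities into class labels.
# Values strictly above `cutoff` get the second level, everything else the first.
apply_cutoff <- function(prob, cutoff, levels = c("0", "1")) {
  factor(ifelse(prob > cutoff, levels[2], levels[1]), levels = levels)
}

model <- glm(vs ~ wt + disp, data = mtcars, family = binomial)
prob  <- predict(model, mtcars, type = "response")

cutoff <- 0.77  # assumed: the value from the question; use your own ROC result
pred   <- apply_cutoff(prob, cutoff)
truth  <- factor(mtcars$vs, levels = c("0", "1"))

# Base R cross-tabulation of predictions against the true classes.
# With caret loaded, the same two factors can be passed to
# confusionMatrix(pred, truth, positive = "1").
print(table(Prediction = pred, Reference = truth))
```

Note that the output above reports 'Positive' Class : 0, because caret takes the first factor level as the positive class by default. Passing `positive = "1"` to `confusionMatrix` makes sensitivity and specificity refer to class 1 instead, which is usually what a ROC-derived cutoff was tuned for.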
Upvotes: 5