jtlz2

Reputation: 8417

scikit-learn: How do I define the thresholds for the ROC curve?

When plotting the ROC (or deriving the AUC) in scikit-learn, how can one specify arbitrary thresholds for roc_curve, rather than having the function calculate them internally and return them?

from sklearn.metrics import roc_curve
fpr, tpr, thresholds = roc_curve(y_true, y_pred)

A related question was asked at Scikit - How to define thresholds for plotting roc curve, but the accepted answer there indicates that the OP's intent was different from how the question was worded.

Thanks!

Upvotes: 2

Views: 6161

Answers (2)

Arthur G.

Reputation: 93

It's quite simple. The ROC curve shows you the model's performance at different thresholds. You eventually choose the best threshold for your model to make forecasts, but the ROC curve shows you how robust/good your model is across the full range of thresholds. There is a good explanation of how it works here: https://www.dataschool.io/roc-curves-and-auc-explained/
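
For instance, a minimal sketch (the labels and scores below are toy values, purely for illustration) showing the thresholds that roc_curve derives internally from the scores:

from sklearn.metrics import roc_curve

# Toy data, purely for illustration
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_curve(y_true, y_score)
# The thresholds are taken from the distinct score values,
# so each (fpr, tpr) pair corresponds to one of them
print(thresholds)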

Upvotes: 0

Matthieu Brucher

Reputation: 22023

What you get from the classifier are scores, not just a class prediction.

roc_curve will give you a set of thresholds with associated false positive rates and true positive rates.

If you want your own threshold, just use it:

y_class = y_pred > threshold

Then you can compute a confusion matrix comparing this new y_class to y_true.
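
As a minimal sketch (the scores and the 0.5 threshold are just assumptions for illustration):

import numpy as np
from sklearn.metrics import confusion_matrix

# Toy scores and labels, purely for illustration
y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0.1, 0.4, 0.35, 0.8])

threshold = 0.5                             # your own choice of threshold
y_class = (y_pred > threshold).astype(int)  # binarise the scores

tn, fp, fn, tp = confusion_matrix(y_true, y_class).ravel()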

And if you want several thresholds, do the same for each of them and derive the true and false positive rates from each confusion matrix, as sketched below.
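
For example, looping over a few arbitrary thresholds (the values here are placeholders, not a recommendation):

import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0.1, 0.4, 0.35, 0.8])

for threshold in [0.2, 0.5, 0.7]:  # thresholds of your own choosing
    y_class = (y_pred > threshold).astype(int)
    # labels=[0, 1] keeps the matrix 2x2 even if a class is missing
    tn, fp, fn, tp = confusion_matrix(y_true, y_class, labels=[0, 1]).ravel()
    tpr = tp / (tp + fn)  # true positive rate at this threshold
    fpr = fp / (fp + tn)  # false positive rate at this threshold
    print(f"threshold={threshold}: TPR={tpr:.2f}, FPR={fpr:.2f}")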

Upvotes: 3
