Bhaskar Dhariyal

Reputation: 1395

How can I change the training threshold for any learning algorithm in sklearn?

I'm trying to train a model using sklearn; however, I want to change the decision threshold used to train the model. Most of the results I find on SO are about prediction on the test set.

Upvotes: 1

Views: 611

Answers (1)

desertnaut

Reputation: 60321

There is no threshold involved in training a probabilistic classifier (with scikit-learn or any other framework).

A threshold is needed only at inference time, to convert the probabilistic predictions into hard labels, which in turn are needed to calculate what are essentially business metrics such as accuracy, precision, and recall. But these metrics play no role in model training, where the only quantity that matters (and is minimized during model fitting) is the loss. And no threshold is involved in the computation of the loss.

In other words, hard class predictions (the only thing a threshold is required for) play absolutely no role in model training, hence no threshold is involved during training whatsoever.
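To make the distinction concrete, here is a minimal sketch in plain Python (not scikit-learn's internals; the helper names `log_loss` and `accuracy` and the toy numbers are made up for illustration). The loss is computed from the predicted probabilities directly, while accuracy only becomes defined once a threshold is picked:

```python
import math

def log_loss(y_true, p_pred):
    """Mean binary cross-entropy. Takes probabilities as-is: no threshold argument."""
    eps = 1e-15  # clip to avoid log(0)
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def accuracy(y_true, p_pred, threshold):
    """Accuracy needs hard labels, so it needs a threshold."""
    hard = [1 if p >= threshold else 0 for p in p_pred]
    return sum(h == y for h, y in zip(hard, y_true)) / len(y_true)

# Toy data, purely illustrative
y_true = [1, 0, 1, 0]
p_pred = [0.9, 0.3, 0.4, 0.2]

loss = log_loss(y_true, p_pred)          # same value whatever threshold you later choose
acc_at_050 = accuracy(y_true, p_pred, 0.50)
acc_at_035 = accuracy(y_true, p_pred, 0.35)
```

Moving the threshold from 0.50 to 0.35 changes the accuracy on this toy data, but `loss` stays the same, since it never sees the threshold.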

I kindly suggest reading the following answers of mine, for clarifying the relation between loss and accuracy (despite the titles, they are not specific to Keras, but they hold for any binary classification problem in principle):

Quoting also from the Cross Validated thread Reduce Classification Probability Threshold:

the statistical component of your exercise ends when you output a probability for each class of your new sample. Choosing a threshold beyond which you classify a new observation as 1 vs. 0 is not part of the statistics any more. It is part of the decision component.
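In scikit-learn terms, that split between the statistical and the decision component might look like the sketch below. The synthetic dataset, the choice of `LogisticRegression`, and the threshold value of 0.3 are all assumptions for illustration; the point is only that `fit` takes no threshold and that the cutoff is applied afterwards to the output of `predict_proba`:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary problem, for illustration only
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Statistical component: fitting minimizes the loss, no threshold involved
clf = LogisticRegression().fit(X_train, y_train)
proba = clf.predict_proba(X_test)[:, 1]  # estimated P(class 1) for each test sample

# Decision component: choose a threshold after training (0.3 is an arbitrary example)
threshold = 0.3
y_pred_custom = (proba >= threshold).astype(int)

# For comparison: predict() on a binary problem corresponds to a 0.5 cutoff
y_pred_default = clf.predict(X_test)
```

Lowering the threshold below 0.5 can only turn predicted negatives into positives, so `y_pred_custom` flags at least as many samples as `y_pred_default`; which value is appropriate depends on the relative costs of false positives and false negatives, not on the fitted model.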

Upvotes: 2
