Reputation: 21
In Logistic Regression for binary classification, when using predict(), how does the classifier decide on the class (1/0)?
Is it based on a probability threshold: if >0.5 then 1, else 0? If so, can this threshold be changed manually?
I know we get probabilities from predict_proba(), but I was curious about the predict() function!
Upvotes: 1
Views: 1296
Reputation: 43494
Logistic Regression, like other probabilistic classification models, estimates a probability for each class. In the binary case there are only two classes.
From the source code, predict() returns the class with the highest class probability.
def predict(self, X):
    """Predict class labels for samples in X.

    Parameters
    ----------
    X : {array-like, sparse matrix}, shape = [n_samples, n_features]
        Samples.

    Returns
    -------
    C : array, shape = [n_samples]
        Predicted class label per sample.
    """
    scores = self.decision_function(X)
    if len(scores.shape) == 1:
        indices = (scores > 0).astype(np.int)
    else:
        indices = scores.argmax(axis=1)
    return self.classes_[indices]
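So yes, in the binary case it returns the class with a probability greater than 50%: a decision score above 0 corresponds to a sigmoid probability above 0.5, and the two class probabilities sum to 1. predict() itself takes no threshold argument, so to use a different cutoff you have to apply it to the output of predict_proba() yourself.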
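As a rough sketch of the point above, the snippet below checks that predict() agrees with cutting predict_proba() at 0.5, and then applies a different cutoff by hand. The synthetic data from make_classification and the 0.3 threshold are assumptions chosen only for illustration; they are not from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data, purely for illustration (assumption, not from the question)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = LogisticRegression().fit(X, y)

# Probability of the positive class, i.e. clf.classes_[1]
proba_pos = clf.predict_proba(X)[:, 1]

# predict() should match a 0.5 cutoff on that probability
# (barring exact floating-point ties at 0.5)
default_pred = clf.classes_[(proba_pos > 0.5).astype(int)]
matches_predict = np.array_equal(default_pred, clf.predict(X))

# Manual cutoff: 0.3 is an arbitrary illustrative value
threshold = 0.3
custom_pred = clf.classes_[(proba_pos >= threshold).astype(int)]
```

Lowering the cutoff below 0.5 can only move samples from class 0 to class 1, so it trades precision for recall on the positive class; raising it does the opposite.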
Upvotes: 1