Reputation: 871
I am training on my dataset using a linear SVM in scikit-learn. Can I get the probability with which a sample is classified under a given label?
For example, fitting the data with SGDClassifier(loss="log") enables the predict_proba method, which gives a vector of probability estimates P(y|x) per sample x:
>>> clf = SGDClassifier(loss="log").fit(X, y)
>>> clf.predict_proba([[1., 1.]])
Output:
array([[ 0.0000005, 0.9999995]])
Is there a similar function I can use to compute prediction probabilities with svm.LinearSVC (multi-class classification)? I know the decision_function method returns confidence scores for samples in this case. But is there any way to compute probability estimates for the samples from this decision function?
Upvotes: 1
Views: 2089
Reputation: 363517
No, LinearSVC will not compute probabilities because it's not trained to do so. Use sklearn.linear_model.LogisticRegression, which uses the same algorithm as LinearSVC but with the log loss. It uses the standard logistic function for probability estimates:
1. / (1 + exp(-decision_function(X)))
(For the same reason, SGDClassifier will only output probabilities when loss="log", not with its default loss function, which causes it to learn a linear SVM.)
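As a rough check of the formula above, the following sketch fits LogisticRegression on a synthetic binary problem and compares the hand-computed logistic of decision_function with predict_proba. The dataset and variable names are illustrative assumptions, not part of the question, and the comparison is only meant for the binary case with default settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary problem (illustrative data, not from the question).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

clf = LogisticRegression().fit(X, y)

# Logistic function applied to the signed distance from the hyperplane.
scores = clf.decision_function(X)
manual = 1.0 / (1.0 + np.exp(-scores))

# predict_proba returns one column per class; column 1 is the positive class.
proba = clf.predict_proba(X)[:, 1]

print(np.allclose(manual, proba))
```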
Upvotes: 2
Reputation: 48307
Multi-class classification is done one-vs-all. For an SGDClassifier, since the distance to the hyperplane corresponding to a particular class is returned, the probability is calculated as
(clip(decision_function(X), -1, 1) + 1) / 2
Refer to the code for details.
You can implement a similar function for LinearSVC. It seems reasonable to me, although it probably needs some justification. Refer to the paper mentioned in the docs:
Zadrozny and Elkan, "Transforming classifier scores into multiclass probability estimates", SIGKDD'02, http://www.research.ibm.com/people/z/zadrozny/kdd2002-Transf.pdf
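A minimal sketch of how the clip-based mapping might be applied to LinearSVC scores follows. The helper name clipped_proba, the row normalisation, and the uniform fallback for all-zero rows are assumptions made for illustration (they are modelled loosely on what SGDClassifier appears to do for loss="modified_huber"), and the resulting numbers are not calibrated probabilities in any rigorous sense.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Illustrative three-class data (an assumption, not from the question).
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=3, random_state=0)
clf = LinearSVC(max_iter=10000).fit(X, y)


def clipped_proba(scores):
    """Hypothetical helper: clip scores to [-1, 1], rescale to [0, 1], normalise rows."""
    scores = np.asarray(scores, dtype=float)
    p = (np.clip(scores, -1, 1) + 1) / 2
    if p.ndim == 1:
        # Binary case: decision_function returns one score per sample.
        return np.column_stack([1 - p, p])
    row_sums = p.sum(axis=1, keepdims=True)
    # If every class score was clipped to 0, fall back to a uniform row.
    safe_sums = np.where(row_sums == 0, 1.0, row_sums)
    uniform = np.full_like(p, 1.0 / p.shape[1])
    return np.where(row_sums == 0, uniform, p / safe_sums)


proba = clipped_proba(clf.decision_function(X))
print(proba[:3])
```

Each row of proba sums to 1, so it can stand in for predict_proba output in downstream code, but whether the values are meaningful as probabilities is exactly the justification the answer says is still needed.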
P.S. A comment from "Is there 'predict_proba' for LinearSVC?":
If you want probabilities, you should use either logistic regression or SVC. Both can predict probabilities, but in very different ways.
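To make the comment concrete, here is a small sketch that fits both models on the same synthetic data and prints their probability estimates side by side. LogisticRegression derives probabilities directly from its fitted model, while SVC(probability=True) fits an additional Platt-scaling step using internal cross-validation, so the two outputs generally differ. The dataset and parameter choices are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Illustrative three-class data (an assumption, not from the question).
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=3, random_state=0)

# Logistic regression: probabilities come straight from the model.
logreg = LogisticRegression(max_iter=1000).fit(X, y)

# SVC: probability=True adds a calibration step fitted by cross-validation.
svc = SVC(kernel="linear", probability=True, random_state=0).fit(X, y)

p_logreg = logreg.predict_proba(X[:5])
p_svc = svc.predict_proba(X[:5])

print(np.round(p_logreg, 3))
print(np.round(p_svc, 3))
```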
Upvotes: 1