worker

Reputation: 31

Why is the result of sklearn.svm.SVC.predict() inconsistent with sklearn.svm.SVC.predict_proba()?

I used sklearn.svm.SVC to build a support vector classifier, as shown below.

import numpy as np
from sklearn.svm import SVC
    
svc = SVC(probability=True)

X = np.random.randint(0, 100, [100, 3])
y = np.random.choice([0, 1, 2], 100, replace=True)
svc.fit(X, y)

print(svc.predict([[10, 20, 30]]), svc.predict_proba([[10, 20, 30]]))

The outputs are

[2] [[0.38993057 0.3791583  0.23091113]]

The output of svc.predict_proba() gives class 0 the highest probability, so I would expect the instance to be assigned to class 0. But svc.predict() returns class 2 instead. Why are these two results inconsistent?

Upvotes: 2

Views: 1130

Answers (1)

ATIF ADIB

Reputation: 589

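The scikit-learn documentation notes that with svm.SVC(probability=True), the class returned by .predict() may differ from the class with the highest value in .predict_proba(). The two methods rely on different models: .predict() uses the SVM decision function directly, while .predict_proba() uses Platt scaling, a separate probability model fitted on top of the SVM scores with an internal 5-fold cross-validation. Because that calibration step is fitted on cross-validation splits rather than derived from the decision function itself, the probabilities are not guaranteed to agree with .predict(), and the documentation warns that they can be unreliable on very small datasets.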
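
To see how often the two methods disagree on data like yours, here is a minimal sketch. It assumes the same kind of random data as in the question, with fixed seeds added so the run is repeatable. The variable names (rng, labels_from_predict, labels_from_proba, agreement) are my own and not part of scikit-learn.

```python
import numpy as np
from sklearn.svm import SVC

# Random features and labels, as in the question, but seeded (an assumption
# added here so repeated runs give the same data).
rng = np.random.RandomState(0)
X = rng.randint(0, 100, size=(100, 3))
y = rng.choice([0, 1, 2], size=100, replace=True)

# random_state fixes the shuffling used by the internal cross-validation
# that fits the probability model.
svc = SVC(probability=True, random_state=0)
svc.fit(X, y)

# Labels from the decision function.
labels_from_predict = svc.predict(X)

# Labels implied by the calibrated probabilities: take the most probable
# column and map it back to a class label through classes_.
proba = svc.predict_proba(X)
labels_from_proba = svc.classes_[np.argmax(proba, axis=1)]

# Fraction of samples on which the two approaches give the same label.
agreement = np.mean(labels_from_predict == labels_from_proba)
print(f"predict() and argmax(predict_proba()) agree on {agreement:.0%} of samples")
```

If you need a single label, .predict() is the one backed by the fitted decision function. If you need labels that always match the probabilities, take the argmax of .predict_proba() yourself as above, keeping in mind that the probabilities come from the separate calibration step.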

Upvotes: 3
