Reputation: 31
I used sklearn.svm.SVC to build a support vector classifier, as shown below.
import numpy as np
from sklearn.svm import SVC
svc = SVC(probability=True)
X = np.random.randint(0, 100, [100, 3])
y = np.random.choice([0, 1, 2], 100, replace=True)
svc.fit(X, y)
print(svc.predict([[10, 20, 30]]), svc.predict_proba([[10, 20, 30]]))
The output is:
[2] [[0.38993057 0.3791583 0.23091113]]
The result of svc.predict_proba() suggests that the instance belongs to class 0, which has the highest probability, but svc.predict() returns class 2 instead. Why are these two results inconsistent?
Upvotes: 2
Views: 1130
Reputation: 589
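The scikit-learn documentation explicitly mentions that if you use svm.SVC(probability=True), the class returned by .predict() may differ from the argmax of .predict_proba(). The two results come from different computations: .predict() uses the SVM decision function directly, while the probabilities are obtained by Platt scaling, which is fitted with an internal 5-fold cross-validation. That cross-validation shuffles the data, so the probability estimates also depend on random_state.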
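To check how often the two disagree on data like yours, here is a minimal sketch. The seed, the variable names, and the choice to evaluate on the training set are my own assumptions for illustration, not part of the question.

```python
import numpy as np
from sklearn.svm import SVC

# Random data shaped like the question's; the seed is an arbitrary choice.
rng = np.random.default_rng(0)
X = rng.integers(0, 100, size=(100, 3))
y = rng.choice([0, 1, 2], size=100)

# random_state fixes the shuffling used by the internal cross-validation
# that fits the probability model.
svc = SVC(probability=True, random_state=0).fit(X, y)

# Labels from the decision function (what predict() returns).
pred = svc.predict(X)

# Labels implied by the calibrated probabilities.
proba = svc.predict_proba(X)
proba_pred = svc.classes_[np.argmax(proba, axis=1)]

# Count the samples where the two labelings differ.
n_disagree = int(np.sum(pred != proba_pred))
print(f"{n_disagree} of {len(X)} samples disagree")
```

On random labels like these the classes are barely separable, so disagreements are likely. If you need the label and the probabilities to agree, one option is to always take the argmax of predict_proba() yourself instead of calling predict().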
Upvotes: 3