Reputation: 638
I am trying to use the LinearSVC classifier through NLTK's SklearnClassifier wrapper.
Update: Added imports
import nltk
from nltk.tokenize import word_tokenize
from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.svm import LinearSVC, SVC
LinearSVC_classifier = SklearnClassifier(LinearSVC())
LinearSVC_classifier.train(featuresets)
But when I try to classify with probabilities
LinearSVC_classifier.prob_classify(feats)
an AttributeError occurs:
AttributeError: 'LinearSVC' object has no attribute 'predict_proba'
I checked the sklearn documentation, and it says that this function exists.
How can I fix this?
Upvotes: 14
Views: 42735
Reputation: 27
A quick workaround is to wrap LinearSVC in CalibratedClassifierCV, which provides predict_proba. In my experience, this reduces the quality of classification.
>>> from sklearn.calibration import CalibratedClassifierCV
>>> from sklearn.svm import LinearSVC
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_features=4, random_state=0)
>>> clf = make_pipeline(CalibratedClassifierCV(LinearSVC()))
>>> clf.fit(X, y)
>>> clf.predict_proba(make_classification(n_features=4, random_state=1,n_samples=10,)[0])
array([[0.71360647, 0.28639353],
[0.26277422, 0.73722578],
[0.79450968, 0.20549032],
[0.79352606, 0.20647394],
[0.2486062 , 0.7513938 ],
[0.85705016, 0.14294984],
[0.90102124, 0.09897876],
[0.58071282, 0.41928718],
[0.31289046, 0.68710954],
[0.14466406, 0.85533594]])
Upvotes: 0
Reputation: 419
This can happen if there is a mismatch between the scikit-learn version used to train the model and the version used to load it for prediction.
Upvotes: 1
Reputation: 645
You can use _predict_proba_lr() instead of predict_proba. Something like this:
from sklearn import svm
clf = svm.LinearSVC()
clf.fit(X_train, Y_train)
# _predict_proba_lr takes only the feature matrix, not the labels
res = clf._predict_proba_lr(X_test)
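res will be a 2D array with one row per sample and one column per class, holding the estimated probability of each class. Note that the leading underscore marks this as a private method, so it may change between scikit-learn versions.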
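For the binary case, roughly the same numbers can be obtained from the public API alone. The sketch below assumes, based on my reading of the scikit-learn source, that the private helper simply passes decision_function scores through a logistic sigmoid. The toy data from make_classification is only for illustration. These are squashed margins, not calibrated probabilities.

```python
import numpy as np
from scipy.special import expit
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Toy binary data, for illustration only
X, y = make_classification(n_features=4, random_state=0)
clf = LinearSVC().fit(X, y)

# Signed distances to the separating hyperplane, shape (n_samples,)
scores = clf.decision_function(X[:5])

# Logistic sigmoid maps each score into (0, 1); treat it as P(class 1)
p_pos = expit(scores)

# Columns follow clf.classes_, i.e. [P(class 0), P(class 1)]
proba = np.column_stack([1.0 - p_pos, p_pos])
print(proba)
```

A positive score always lands above 0.5, so the most probable class agrees with clf.predict.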
Upvotes: 7
Reputation: 33522
Your question does not mention any outside wrapper such as NLTK (apart from the tag), so it is hard to tell what you really need!
Vivek Kumar's comment applies. LinearSVC has no support for probabilities, while SVC does.
Now some additional remarks:
It seems someone observed this problem before.
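The difference between the two estimators can be checked without fitting anything. This is a minimal sketch using only hasattr. It relies on my understanding that SVC exposes predict_proba only when probability=True.

```python
from sklearn.svm import SVC, LinearSVC

# LinearSVC never defines predict_proba
linear_has_proba = hasattr(LinearSVC(), "predict_proba")

# SVC exposes predict_proba only when probability=True
svc_default_has_proba = hasattr(SVC(kernel="linear"), "predict_proba")
svc_prob_has_proba = hasattr(SVC(kernel="linear", probability=True), "predict_proba")

print(linear_has_proba, svc_default_has_proba, svc_prob_has_proba)
```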
Upvotes: 5
Reputation: 1136
According to the sklearn documentation, the method 'predict_proba' is not defined for 'LinearSVC'.
Workaround:
LinearSVC_classifier = SklearnClassifier(SVC(kernel='linear',probability=True))
Use SVC with a linear kernel and the probability argument set to True, as explained here.
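A minimal end-to-end sketch of this workaround with the NLTK wrapper might look like the following. The feature names and the tiny training set are hypothetical, made up purely for illustration, and random_state is fixed only to make the run repeatable.

```python
from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.svm import SVC

# Hypothetical toy featuresets: (feature dict, label) pairs
train = (
    [({"good": True, "bad": False}, "pos")] * 10
    + [({"good": False, "bad": True}, "neg")] * 10
)

# probability=True makes SVC fit an extra probability model during training
classifier = SklearnClassifier(
    SVC(kernel="linear", probability=True, random_state=0)
)
classifier.train(train)

# prob_classify returns a probability distribution over the labels
dist = classifier.prob_classify({"good": True, "bad": False})
probs = {label: dist.prob(label) for label in dist.samples()}
print(probs)
```

Keep in mind that probability=True makes training slower, because scikit-learn runs an internal cross-validation to fit the probability estimates.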
Upvotes: 22