Reputation: 125
I got a convergence warning when fitting a linear support vector machine in scikit-learn on the breast cancer dataset.
Below is the code:
from sklearn.svm import LinearSVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
cancer = load_breast_cancer()
(X_cancer, y_cancer) = load_breast_cancer(return_X_y = True)
X_train, X_test, y_train, y_test = train_test_split(X_cancer, y_cancer, random_state = 0)
clf = LinearSVC(max_iter=700000).fit(X_train, y_train)
print('Breast cancer dataset')
print('Accuracy of Linear SVC classifier on training set: {:.2f}'
.format(clf.score(X_train, y_train)))
print('Accuracy of Linear SVC classifier on test set: {:.2f}'
.format(clf.score(X_test, y_test)))
Even with an extremely high number of iterations, I still got a ConvergenceWarning:
ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
  warnings.warn("Liblinear failed to converge, increase "
Can anyone explain why it cannot converge? And, in general, can I just ignore a convergence warning, or do I need to tune the model further?
Thank you very much!
Upvotes: 1
Views: 2738
Reputation: 46898
SVM methods are distance-based, and your columns are on very different scales, so the solver struggles to converge on the raw features. It makes sense to scale the data before fitting the model. See posts such as this or this for more.
So if we do it again with scaling:
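To make the point about differing scales concrete, here is a small sketch that compares the per-feature standard deviations of the raw dataset. The variable names (`stds`, `scale_ratio`) are my own, chosen for illustration only.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X = data.data

# Standard deviation of every column in the raw (unscaled) data
stds = X.std(axis=0)

# Indices of the features with the smallest and largest spread
smallest = int(np.argmin(stds))
largest = int(np.argmax(stds))

print("smallest spread:", data.feature_names[smallest], stds[smallest])
print("largest spread: ", data.feature_names[largest], stds[largest])

# Ratio between the widest and narrowest feature scales
scale_ratio = stds[largest] / stds[smallest]
print("ratio of largest to smallest std:", scale_ratio)
```

The ratio spans several orders of magnitude, which is the kind of mismatch that standardization removes.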
from sklearn.svm import LinearSVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
cancer = load_breast_cancer()
(X_cancer, y_cancer) = load_breast_cancer(return_X_y = True)
X_cancer = StandardScaler().fit_transform(X_cancer)
X_train, X_test, y_train, y_test = train_test_split(X_cancer, y_cancer, random_state = 0)
clf = LinearSVC().fit(X_train, y_train)
You get a pretty good accuracy without the convergence issue:
print('Accuracy of Linear SVC classifier on training set: {:.2f}'
.format(clf.score(X_train, y_train)))
print('Accuracy of Linear SVC classifier on test set: {:.2f}'
.format(clf.score(X_test, y_test)))
Accuracy of Linear SVC classifier on training set: 0.99
Accuracy of Linear SVC classifier on test set: 0.94
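One caveat about the code above: it fits the scaler on the full dataset before splitting, so statistics from the test rows influence the preprocessing. A possible variant, sketched here under the assumption that a standard scikit-learn pipeline suits your setup, fits the scaler on the training split only and also records whether a ConvergenceWarning was raised. The names `model`, `caught`, and `converged` are illustrative, and I have not listed expected scores because they may differ slightly from the numbers above.

```python
import warnings

from sklearn.datasets import load_breast_cancer
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The pipeline fits StandardScaler on the training data only and
# reuses those statistics when scoring the test data.
model = make_pipeline(StandardScaler(), LinearSVC())

# Record warnings raised during fitting instead of printing them
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    model.fit(X_train, y_train)

# True if no ConvergenceWarning was raised while fitting
converged = not any(issubclass(w.category, ConvergenceWarning) for w in caught)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)

print("Converged without warning:", converged)
print("Accuracy on training set: {:.2f}".format(train_acc))
print("Accuracy on test set: {:.2f}".format(test_acc))
```

Checking `converged` explicitly is a cheap way to avoid silently ignoring the warning: if it is False, the fitted coefficients may not be the optimum, so scaling or other tuning is worth a look before trusting the scores.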
Upvotes: 2