Reputation: 1937
I have some code that tries to fit a non-linear SVM (RBF kernel).
import numpy as np
from sklearn import svm

# Load the feature matrix and the class labels from CSV
dataset1 = np.loadtxt("/Users/prateek/Desktop/Programs/ML/Dataset.csv", delimiter=",")
result1 = np.loadtxt("/Users/prateek/Desktop/Programs/ML/Result.csv", delimiter=",")

# NuSVC with an RBF kernel and the default nu (0.5)
clf = svm.NuSVC(kernel='rbf')
clf.fit(dataset1, result1)
However, when I call fit, I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/prateek/Desktop/Programs/ML/lib/python2.7/site-packages/sklearn/svm/base.py", line 193, in fit
    fit(X, y, sample_weight, solver_type, kernel, random_seed=seed)
  File "/Users/prateek/Desktop/Programs/ML/lib/python2.7/site-packages/sklearn/svm/base.py", line 251, in _dense_fit
    max_iter=self.max_iter, random_seed=random_seed)
  File "sklearn/svm/libsvm.pyx", line 187, in sklearn.svm.libsvm.fit (sklearn/svm/libsvm.c:2098)
ValueError: specified nu is infeasible
What is the reason for such an error?
Upvotes: 3
Views: 4458
Reputation: 2758
The nu parameter is, as pointed out in the documentation, "An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors".
So whenever you try to fit your data and this bound cannot be satisfied, the optimization problem becomes infeasible. Hence your error.
As a matter of fact, I looped nu from 1.0 down to 0.1 (in steps of 0.1) and still got the error; I then tried 0.01 and it fit without complaint. Of course, you should still check the results of fitting your model with that value and confirm that the prediction accuracy is acceptable.
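One possible explanation for why only a small nu works: as far as I can tell, libsvm's parameter check rejects nu when, for some pair of classes with n1 and n2 samples, nu * (n1 + n2) / 2 > min(n1, n2). If that is what is happening here, strongly imbalanced classes push the largest usable nu well below the default of 0.5. The sketch below estimates that ceiling from the labels alone; `max_feasible_nu` is a hypothetical helper written for illustration (not part of scikit-learn), and the label counts are made up.

```python
from collections import Counter
from itertools import combinations


def max_feasible_nu(labels):
    """Largest nu that passes the pairwise class-balance check.

    Hypothetical helper, assuming the check is
    nu * (n1 + n2) / 2 <= min(n1, n2) for every pair of classes.
    Requires at least two distinct classes in `labels`.
    """
    counts = Counter(labels).values()
    return min(
        2.0 * min(n1, n2) / (n1 + n2)
        for n1, n2 in combinations(counts, 2)
    )


# Made-up example: 95 samples of class 0 and 5 samples of class 1.
labels_example = [0] * 95 + [1] * 5
print(max_feasible_nu(labels_example))
```

Calling this on your own label array (for example `max_feasible_nu(result1.tolist())`) should give a rough upper limit for nu, which you can then compare with the values that raised the error.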
Update: I was curious, so I split your dataset into training and test sets to validate. The result was 69% accuracy (I also think your training set might be too small).
For reproducibility, here is the quick test I performed:
import numpy as np
from sklearn import svm
# sklearn.cross_validation was removed in scikit-learn 0.20;
# train_test_split now lives in sklearn.model_selection
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the feature matrix and the class labels from CSV
dataset1 = np.loadtxt("Dataset.csv", delimiter=",")
result1 = np.loadtxt("Result.csv", delimiter=",")

# Hold out 25% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    dataset1, result1, test_size=0.25, random_state=42)

# A small nu that passes the feasibility check on this data
clf = svm.NuSVC(kernel='rbf', nu=0.01)
clf.fit(X_train, y_train)

# Fraction of correctly classified test samples
y_pred = clf.predict(X_test)
print(accuracy_score(y_test, y_pred))
Upvotes: 2