Reputation: 1212
Here is the code that I was using.
import glob
import cv2
import numpy as np
from sklearn import svm

nfiles = 40
i = 0
y = [1] * 20
y.extend([-1] * 20)
traindata = np.zeros((40, 1024))
for im in glob.glob("/home/name/Desktop/database/20trainset/*.png"):
    img = cv2.imread(im)
    img = im2double(img)   # user-defined helper: scale pixel values to [0, 1]
    img = rgb2gray(img)    # user-defined helper: convert to a single 32x32 channel
    traindata[i] = img.reshape(1024)
    i += 1
clf = svm.SVC(kernel='rbf', C=0.05)
clf.fit(traindata, y)
print(clf.support_vectors_)
The variable i keeps count of the number of files processed.
Is there anything wrong?
Upvotes: 0
Views: 105
Reputation: 66815
This means that, with your parameters, the optimal SVM is to do nothing: it builds a trivial model that simply answers with one label for every input.
cl(x) = sign( SUM_i alpha_i y_i K(x_i, x) + b ) = sign( b ) = const.
So yes, there is something wrong: your model does not use your training data at all.
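You can see this directly from the fitted estimator. A minimal sketch, reusing clf, traindata and y from your code: if the model has collapsed to sign(b), every prediction is the same label and training accuracy on this balanced 20/20 set is around 0.5.

preds = clf.predict(traindata)
print(np.unique(preds))                             # only one of the two labels
print("train accuracy:", clf.score(traindata, y))   # ~0.5 for a trivial model
print("intercept b:", clf.intercept_)
print("dual coefficients (alpha_i * y_i):", clf.dual_coef_)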
What are the reasons? In order to use an RBF-SVM you need to fit two hyperparameters: gamma and C. In particular, C=0.05 is (probably) orders of magnitude too small to work well. Also, remember to normalize your data: images are usually represented as pixel intensity values (0-255), while SVMs work better on roughly normally distributed values. Gamma also has to be fitted carefully (its default value in scikit-learn is 1 / number of features, which is only a rough guess and rarely a good value).
As an example, refer to the libsvm website and the plot showing the relation of C and gamma values to the accuracy obtained on the heart dataset.
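A minimal tuning sketch (assuming traindata and y from your code, a recent scikit-learn, and a purely illustrative parameter grid): scale the pixels to [0, 1] and cross-validate over a logarithmic grid of C and gamma.

import numpy as np
from sklearn import svm
from sklearn.model_selection import GridSearchCV

# Scale raw 0-255 intensities to [0, 1]; skip this if your data is already scaled.
X = traindata / 255.0

param_grid = {
    "C": 10.0 ** np.arange(-1, 5),      # 0.1 ... 10000 (illustrative)
    "gamma": 10.0 ** np.arange(-5, 1),  # 1e-5 ... 1 (illustrative)
}
search = GridSearchCV(svm.SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)

GridSearchCV refits the best parameter combination on the whole training set, so search.best_estimator_ can be used directly afterwards.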
Upvotes: 2