Reputation: 7625
I am using SVMlight to train a model for binary classification. Using the model, I tested some examples. I was surprised to see that the prediction file contains values greater than 1 as well as less than -1. I thought the range was [-1, 1]. Am I doing something wrong?
Upvotes: 1
Views: 820
Reputation: 646
It makes sense that the values are not bounded by the interval [-1, 1] once you understand how an SVM works. An SVM tries to find the hyperplane (a line, in two dimensions) that separates the negative and positive data points while maximizing its distance to the closest points on each side.
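The values in the prediction file represent the signed distances of the data points from the SVM's optimal hyperplane: positive values lie on the positive-class side of the hyperplane and negative values on the negative-class side. Strictly speaking, each value is the decision function w·x + b, which is the geometric distance scaled by ||w||, so points exactly on the margin score +1 or -1 and points beyond the margin score outside [-1, 1]. These distances can be arbitrarily large or small and are not bounded, as can be seen in this image: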
I've seen some SVM implementations, such as Weka's implementation of Platt's SMO, that normalize the values so that they are confidence values for the positive class bounded by the interval [0, 1]. Both approaches work just fine for judging how confident an SVM is in a classification, since a prediction for a data point farther from the hyperplane is more certain than one for a point lying close to it.
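To make the point about unbounded values concrete, here is a minimal sketch in plain Python. The weight vector `w`, bias `b`, and the sample points are made up for illustration and do not come from any SVMlight model; `squash` is only a simplified stand-in for the kind of [0, 1] normalization mentioned above, with no fitted parameters.

```python
import math

# Hypothetical hyperplane parameters, chosen by hand for illustration
# (an assumption, not the output of a trained SVMlight model).
w = [2.0, 0.0]
b = -4.0

def decision_value(x):
    """Unnormalized decision value w.x + b (the kind of score a prediction file holds)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def signed_distance(x):
    """Geometric signed distance to the hyperplane: (w.x + b) / ||w||."""
    return decision_value(x) / math.sqrt(sum(wi * wi for wi in w))

def squash(value):
    """Logistic squashing into (0, 1); illustrative only, no fitted parameters."""
    return 1.0 / (1.0 + math.exp(-value))

# One point on the margin, one far on the positive side, one on the negative side.
points = [(2.5, 0.0), (5.0, 1.0), (0.0, 3.0)]
for x in points:
    v = decision_value(x)
    label = 1 if v >= 0 else -1  # the predicted class is just the sign
    print(x, v, signed_distance(x), label, round(squash(v), 3))
```

Only the sign of the value decides the class; the magnitude grows without limit as a point moves away from the hyperplane, which is why scores such as 6.0 or -4.0 are perfectly normal.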
Upvotes: 3