Reputation: 11
I want to separate some points of 2 classes in 3D by a plane and thought this should be possible using a support vector machine (SVM).
So I set up the following data file (data.txt) to be analyzed with svmlight:
-1 1:1 2:1 3:0
1 1:1 2:2 3:0
-1 1:3 2:2 3:0
1 1:3 2:3 3:0
-1 1:5 2:3 3:0
1 1:5 2:4 3:0
-1 1:7 2:4 3:0
1 1:7 2:5 3:0
-1 1:1 2:1 3:2
1 1:1 2:2 3:2
-1 1:3 2:2 3:2
1 1:3 2:3 3:2
-1 1:5 2:3 3:2
1 1:5 2:4 3:2
-1 1:7 2:4 3:2
1 1:7 2:5 3:2
execute:
./svm_learn data.txt model
Unfortunately, I do not know how to interpret the resulting model file, or how it describes the separating plane.
Can you help?
Upvotes: 1
Views: 167
Reputation: 66805
After running that command, your model file should look like this:
SVM-light Version V6.02
0 # kernel type
3 # kernel parameter -d
1 # kernel parameter -g
1 # kernel parameter -s
1 # kernel parameter -r
empty# kernel parameter -u
3 # highest feature index
16 # number of training documents
17 # number of support vectors plus 1
0.85699902 # threshold b, each following line is a SV (starting with alpha*y)
-0.035708292619498309405923208714739 1:5 2:3 3:0 #
-0.035708292619498309405923208714739 1:1 2:1 3:2 #
0.035708292619498309405923208714739 1:1 2:2 3:0 #
-0.035708292619498309405923208714739 1:1 2:1 3:0 #
0.035708292619498309405923208714739 1:7 2:5 3:2 #
-0.035708292619498309405923208714739 1:7 2:4 3:2 #
0.035708292619498309405923208714739 1:1 2:2 3:2 #
-0.035708292619498309405923208714739 1:5 2:3 3:2 #
0.035708292619498309405923208714739 1:3 2:3 3:0 #
0.035708292619498309405923208714739 1:7 2:5 3:0 #
-0.035708292619498309405923208714739 1:7 2:4 3:0 #
0.035708292619498309405923208714739 1:3 2:3 3:2 #
-0.035708292619498309405923208714739 1:3 2:2 3:0 #
0.035708292619498309405923208714739 1:5 2:4 3:0 #
-0.035708292619498309405923208714739 1:3 2:2 3:2 #
0.035708292619498309405923208714739 1:5 2:4 3:2 #
These are all the parameters of your model. The first 9 values after the version line describe its configuration (kernel type, kernel parameters, etc.). You can get a description of all the parameters by executing ./svm_learn --help; in particular, kernel type 0 means a linear kernel (which is what you wanted to use).
The rest of the file is the list of support vectors: the points from the training set that are "closest" to the separating hyperplane. This is the main idea behind an SVM: to "support" the decision boundary with only a subset of your data. These vectors are sufficient to define the separation, by simply taking a weighted sum of their values. The weights are the alpha parameters found during the optimization process (and also stored in the file). For simplicity, each weight is already multiplied by its corresponding label (+1 or -1), to distinguish which "side" of the separating hyperplane the vector comes from.
For the linear kernel, interpreting these support vectors is quite simple. You have 17 values: first the threshold b, a free parameter of your hyperplane, and then the 16 support vectors. To convert support vectors of the form
alpha_i*y_i 1:s_i_1 2:s_i_2 3:s_i_3
into a hyperplane, simply calculate the sum
w = SUM_i alpha_i*y_i * (s_i_1, s_i_2, s_i_3) = SUM_i alpha_i*y_i * s_i
which gives the normal (perpendicular) vector of the separating hyperplane. The whole equation of your separating hyperplane is then
<w,x> - b = 0
where w is this sum, b is the threshold defined above (0.85699902 in your case), and <w,x> is the standard scalar product. Note that SVM-light subtracts the threshold, so the decision function is sign(<w,x> - b).
For the linear kernel this is simply
w'x - b = 0
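To make this concrete, here is a small Python sketch that recomputes w from the support-vector lines of the model file above. The alpha*y values and coordinates are copied verbatim from your model output; the trailing '#' comments are stripped before parsing.

```python
# Recover the hyperplane normal w from the SVM-light model file above.
# Each line is: alpha_i*y_i  1:s_i_1  2:s_i_2  3:s_i_3  #
sv_lines = """\
-0.035708292619498309405923208714739 1:5 2:3 3:0 #
-0.035708292619498309405923208714739 1:1 2:1 3:2 #
0.035708292619498309405923208714739 1:1 2:2 3:0 #
-0.035708292619498309405923208714739 1:1 2:1 3:0 #
0.035708292619498309405923208714739 1:7 2:5 3:2 #
-0.035708292619498309405923208714739 1:7 2:4 3:2 #
0.035708292619498309405923208714739 1:1 2:2 3:2 #
-0.035708292619498309405923208714739 1:5 2:3 3:2 #
0.035708292619498309405923208714739 1:3 2:3 3:0 #
0.035708292619498309405923208714739 1:7 2:5 3:0 #
-0.035708292619498309405923208714739 1:7 2:4 3:0 #
0.035708292619498309405923208714739 1:3 2:3 3:2 #
-0.035708292619498309405923208714739 1:3 2:2 3:0 #
0.035708292619498309405923208714739 1:5 2:4 3:0 #
-0.035708292619498309405923208714739 1:3 2:2 3:2 #
0.035708292619498309405923208714739 1:5 2:4 3:2 #
""".strip().splitlines()

b = 0.85699902          # threshold from the model file
w = [0.0, 0.0, 0.0]     # accumulates SUM_i alpha_i*y_i * s_i
for line in sv_lines:
    fields = line.split("#")[0].split()   # drop the trailing comment
    ay = float(fields[0])                 # alpha_i * y_i
    for feat in fields[1:]:
        idx, val = feat.split(":")
        w[int(idx) - 1] += ay * float(val)

print("w =", w)   # SVM-light's decision function is sign(<w,x> - b)
```

With these numbers, w comes out as roughly (0, 0.286, 0).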
In the general case, when you use a more complex kernel (such as RBF or even polynomial), you cannot recover the actual hyperplane parameters, because the hyperplane lives in the induced feature space (which, for the RBF kernel, has infinitely many dimensions). You can only evaluate the decision function in its functional form
f(x) = SUM_i alpha_i*y_i * K(s_i, x) - b
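This functional form can be sketched directly in code. Below is a minimal illustration, not taken from your model: the RBF kernel, the gamma value, and the tiny support-vector list are all made-up assumptions just to show the shape of the computation.

```python
import math

def rbf_kernel(s, x, gamma=1.0):
    # K(s, x) = exp(-gamma * ||s - x||^2); gamma=1.0 is an illustrative choice
    return math.exp(-gamma * sum((si - xi) ** 2 for si, xi in zip(s, x)))

def decision(x, svs, b, kernel):
    # svs: list of (alpha_i*y_i, s_i) pairs as read from the model file;
    # the predicted class is the sign of this value (b is subtracted,
    # following SVM-light's convention).
    return sum(ay * kernel(s, x) for ay, s in svs) - b

# Tiny usage example with made-up alpha*y values and support vectors:
svs = [(0.5, (1.0, 2.0, 0.0)), (-0.5, (3.0, 2.0, 0.0))]
print(decision((1.0, 2.0, 0.0), svs, 0.0, rbf_kernel))
```

For the linear kernel, K(s, x) is just the dot product, and the sum collapses to the <w,x> - b expression above.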
Upvotes: 2