Reputation:
With StompChicken's corrections (I miscomputed one dot product, ugh!) the answer appears to be yes. I have since tested the same problem using a precomputed kernel with the same correct results. If you are using libsvm StompChickens clear, organized computations are a very nice check.
Original Question: I am about to start using precomputed kernels in libSVM. I had noticed Vlad's answer to a question and I thought it would be wise to confirm that libsvm gave correct answers. I started with non-precomputed kernels, just a simple linear kernel with 2 classes and three data points in 3 dimensional space. I used the data
1 1:3 2:1 3:0
2 1:3 2:3 3:1
1 1:7 3:9
The model file generated by a call to svm-train -s 0 - t 0
contains
svm_type c_svc
kernel_type linear
nr_class 2
total_sv 3
rho -1.53951
label 1 2
nr_sv 2 1
SV
0.4126650675419768 1:3 2:1 3:0
0.03174528241667363 1:7 3:9
-0.4444103499586504 1:3 2:3 3:1
However when I compute the solution by hand that is not what I get. Does anyone know whether libsvm suffers from errors or can anyone compare notes and see whether they get the same thing libsvm does?
The coefficients a1
, a2
, a3
returned by libsvm are should be the values that make
a1 + a2 + a3 - 5*a1*a1 + 12*a1*a2 - 21*a1*a3 - 19*a2*a2/2 + 21*a2*a3 - 65*a3*a3
as large as possible with the restrictions that
a1 + a3 = a2
and each of a1
, a2
, a3
is required to lie between 0 and 1 (the default value of C).
The above model file says the answer is
a1 = .412665...
a2 = .444410...
a3 = .031745...
But one just has to substitute a2 = a1 + a3
into the big formula above and confirm both partial derivatives are zero to see if this solution is correct (since none of a1
,a2
,a3
is 0 or 1) but they are not zero.
Am I doing something wrong, or is libsvm giving bad results? (I am hoping I am doing something wrong.)
Upvotes: 7
Views: 1117
Reputation: 15971
LibSVM is a very widely used library and I highly doubt anything is drastically wrong with the code. That said, I think it's great that there are people who are paranoid enough to actually check it for correctness - well done!
The solution seems correct according to the working that I give below. What I mean by that is it satisfies the KKT conditions (15.29). It also true that the partial derivatives of the dual vanish at the solution.
Here's my working...
x1 = (3,1,0) x2 = (3,3,1) x3 = (7,0,9)
y1 = -1 y2 = 1 y3 = -1
K = [10 12 21]
[12 19 30]
[21 30 130]
L_dual = a1 + a2 + a3 -5a1^2 + 12a1a2 - 21a1a3 - (19/2)a2^2 + 30a2a3 - 65a3^2)
a1 = 0.412 a2 = 0.4444 a3 = 0.0317
Checking KKT:
y1.f(x1) = y1 * (y1*a1*K(x1,x1) + y2*a2*K(x1,x2) + y3*a3*k(x1,x3) - rho)
= rho + 10*a1 + 21*a3 - 12*a2
~= 1
(Similar for the x2 and x3)
Substituting a2 = a1 + a3 into L_dual:
L_dual = 2a1 + 2a3 - 2.5a1^2 + 2a1a3 - 44.5a3^2
dL/da1 = 2 - 5a1 + 2a3 = 0
dL/da3 = 2 + 2a1 - 89a3 = 0
Upvotes: 9