badgvuria

Reputation: 151

Analyse the trained model of libsvm in Python

Two questions about using libsvm in Python:

  1. How can I know if the problem is feasible or not?
  2. How can I get the primal variables (w and the offset b)?

I use a simple example with 4 training points (depicted by *) in a 2D space:

*----*
|    |
|    |
*----*

I train the SVM with the C_SVC formulation and a linear kernel, and classify the 4 points into two labels, -1 and +1.

For example, when I label the training points like this, it should find a separating hyperplane:

{-1}----{+1}
 |       |
 |       |
{-1}----{+1}

But with this nonlinear problem, it should not be able to find a separating hyperplane (because of the linear kernel):

{+1}----{-1}
 |       |
 |       |
{-1}----{+1}

And I would like to be able to detect this case.


Sample code for the 2nd example:

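from svmutil import *

# Labels and sparse feature dicts ({feature_index: value}) for the 2nd (XOR-like) layout
y = [1, -1, 1, -1]
x = [{1: -1, 2: 1}, {1: -1, 2: -1}, {1: 1, 2: -1}, {1: 1, 2: 1}]

prob  = svm_problem(y, x)
param = svm_parameter()         # svm_type defaults to C_SVC
param.kernel_type = LINEAR      # linear kernel
param.C = 10                    # soft-margin penalty

m = svm_train(prob, param)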

Sample output:

optimization finished, #iter = 21
nu = 1.000000
obj = -40.000000, rho = 0.000000
nSV = 4, nBSV = 4
Total nSV = 4

Upvotes: 1

Views: 803

Answers (1)

ogrisel

Reputation: 40169

Run cross validation over an exponential grid of C values with a linear-kernel SVM, as explained in the libsvm guide. If the training set accuracy never gets close to 100%, the linear model is too biased for the data, which in turn means that the linear assumption is false (the data is not linearly separable).
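
A minimal sketch of the training-accuracy check described above, under some loud assumptions. To stay dependency-free it does not call libsvm: it swaps in a small batch subgradient solver for the same soft-margin objective, so the fitted values will generally differ from libsvm's. It also skips the cross-validation part and only sweeps C on the training set. The helper names (`train_linear_svm`, `training_accuracy`, `best_training_accuracy`) are hypothetical, and the grid 2^-5, 2^-3, ..., 2^15 is assumed from the range suggested in the libsvm guide.

```python
# Sketch only: a pure-Python stand-in for a linear C-SVC, NOT libsvm itself.

def decision(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(xs, ys, C, epochs=200):
    """Batch subgradient descent with step size 1/t on
    0.5*||w||^2 + C * sum(max(0, 1 - y*(w.x + b)))."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for t in range(1, epochs + 1):
        # Points that currently violate the margin (hinge loss is active).
        viol = [(x, y) for x, y in zip(xs, ys) if y * decision(w, b, x) < 1.0]
        shrink, step = 1.0 - 1.0 / t, C / t
        w = [shrink * wj + step * sum(y * x[j] for x, y in viol)
             for j, wj in enumerate(w)]
        b += step * sum(y for _, y in viol)
    return w, b


def training_accuracy(xs, ys, w, b):
    """Fraction of training points classified correctly (ties count as +1)."""
    hits = sum(1 for x, y in zip(xs, ys)
               if (1 if decision(w, b, x) >= 0 else -1) == y)
    return hits / len(ys)


def best_training_accuracy(xs, ys, grid):
    """Highest training accuracy reached over all C values in the grid."""
    return max(training_accuracy(xs, ys, *train_linear_svm(xs, ys, C))
               for C in grid)


# Exponential grid of C (assumed range).
C_GRID = [2.0 ** k for k in range(-5, 16, 2)]

# The four corner points from the question, as dense vectors.
points = [(-1, 1), (-1, -1), (1, -1), (1, 1)]
y_separable = [-1, -1, 1, 1]   # 1st layout: left column -1, right column +1
y_xor = [1, -1, 1, -1]         # 2nd layout: labels alternate around the square

best_sep = best_training_accuracy(points, y_separable, C_GRID)
best_xor = best_training_accuracy(points, y_xor, C_GRID)
print("separable layout, best training accuracy:", best_sep)
print("XOR layout, best training accuracy:", best_xor)
```

If the best training accuracy stays well below 1.0 across the whole grid, as it does for the second layout, that suggests the labels are not linearly separable. With libsvm itself, the same sweep could be done by training once per C value and reading the accuracy reported when predicting on the training set.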

By the way, the test set accuracy is the real measure of the model's generalization ability, but it reflects the sum of the bias and the variance, so it cannot be used directly to measure the bias alone. The difference between the training set and test set accuracies measures the variance, i.e. the overfitting, of the model. More information on error analysis can be found in this blog post summarizing practical tips and tricks from the ml-class online class.

Upvotes: 2
