user1618581

Reputation: 51

R: SVM performance using custom kernel (user defined kernel) is not working in kernlab

I'm trying to use a user-defined kernel. I know that kernlab supports user-defined (custom) kernel functions in R. I used the spam data set included in the kernlab package (57 variables, 4601 examples).

I defined the kernel as follows:

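kp <- function(d, e) {
  # v is a global vector of per-variable scale factors (see below)
  cs <- as.matrix(v * d - v * e)

  # Gaussian kernel on the rescaled difference
  exp(-(norm(cs, "F")^2) / 2)
}

class(kp) <- "kernel"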

This is a modified Gaussian kernel, where v is a vector of continuously varying values: the inverse of the standard deviation of each variable. For example:

v=(0.1666667,........0.1666667)

The training set is 60% of the spam data (preserving the proportions of the different classes).

For training the SVM, the label is 1 if the message's type is spam.

m=ksvm(xtrain,ytrain,type="C-svc",kernel=kp,C=10)

But this step never finishes. R just keeps waiting for a result.

So my question is: why? Is it because the number of examples is too large? Is there any other R package that can train SVMs with a user-defined kernel?

Upvotes: 5

Views: 2440

Answers (1)

lejlot

Reputation: 66815

First, your kernel looks like a classic RBF kernel with v = 1/sigma, so why define it yourself? You can use the built-in RBF kernel and simply set the sigma parameter. In particular, instead of using the Frobenius norm on matrices, you could use the ordinary Euclidean norm on the vectorized matrices.
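As a rough check of this point: kernlab's rbfdot computes exp(-sigma * ||x - y||^2), so if v holds one scale factor per variable, multiplying each column of the data by v and using sigma = 0.5 should reproduce kp. Below is a minimal sketch on made-up toy data. The matrix x and the values in v are hypothetical, not taken from the question.

```r
library(kernlab)

# Hypothetical toy data: 5 points, 3 variables
set.seed(1)
x <- matrix(rnorm(15), nrow = 5)
v <- c(0.5, 1, 2)   # hypothetical per-variable scale factors

# The kernel from the question (reads v from the global environment)
kp <- function(d, e) {
  cs <- as.matrix(v * d - v * e)
  exp(-(norm(cs, "F")^2) / 2)
}
class(kp) <- "kernel"

# Scale each column by v, then use the built-in RBF kernel with sigma = 0.5
xs <- sweep(x, 2, v, `*`)
K_custom  <- kernelMatrix(kp, x)
K_builtin <- kernelMatrix(rbfdot(sigma = 0.5), xs)

# Largest entrywise difference between the two kernel matrices
max_diff <- max(abs(as.vector(K_custom) - as.vector(K_builtin)))
max_diff
```

If max_diff is at the level of floating-point noise, you can train on the rescaled data with kernel = "rbfdot" and kpar = list(sigma = 0.5) and skip the custom kernel entirely.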

Second, your code works just fine on a small example:

> xtrain = as.matrix( c(1,2,3,4) )
> ytrain = as.factor( c(0,0,1,1) )
> v= 0.01
> m=ksvm(xtrain,ytrain,type="C-svc",kernel=kp,C=10)
> m
Support Vector Machine object of class "ksvm" 

SV type: C-svc  (classification) 
 parameter : cost C = 10 


Number of Support Vectors : 4 

Objective Function Value : -39.952 
Training error : 0 

There are at least two reasons why you are still waiting for results:

  • RBF kernels lead to some of the hardest optimization problems for SVMs (especially for large C)
  • User-defined kernels are far less efficient than the built-in ones

As I am not sure whether ksvm actually optimizes the user-defined kernel computation (in fact, I'm pretty sure it does not), you could try to build the kernel matrix (K[i,j] = K(x_i, x_j), where x_i is the i-th training vector) and pass it to ksvm. You can do this with:

K <- kernelMatrix(kp,xtrain)
m <- ksvm(K,ytrain,type="C-svc",kernel='matrix',C=10)
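Since kernelMatrix(kp, xtrain) calls the R function kp once per pair of rows, building K can itself be slow. For this particular kernel the matrix can probably be built in one vectorized step instead. This is only a sketch: the toy data are made up, and v is assumed to be the inverse standard deviation of each column.

```r
library(kernlab)

set.seed(42)
# Hypothetical toy data: 40 points, 3 variables, two classes
xtrain <- rbind(matrix(rnorm(60, mean = 0), ncol = 3),
                matrix(rnorm(60, mean = 2), ncol = 3))
ytrain <- factor(rep(c(0, 1), each = 20))
v <- 1 / apply(xtrain, 2, sd)   # assumed: inverse standard deviation per variable

# Scale columns by v, so that K[i, j] = exp(-||v * (x_i - x_j)||^2 / 2)
xs <- sweep(xtrain, 2, v, `*`)
D2 <- unname(as.matrix(dist(xs)))^2   # squared Euclidean distances between rows
K  <- as.kernelMatrix(exp(-D2 / 2))

# Train on the precomputed matrix, as in the two lines above
m <- ksvm(K, ytrain, type = "C-svc", kernel = "matrix", C = 10)
m
```

Here dist() does the pairwise work in compiled code, so the double loop over an R closure is avoided. The O(n^2) storage cost stays the same.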

Precomputing the kernel matrix can take a long time, but the optimization itself will then be much faster, so it is a good approach if you want to test many different C values (which you certainly should do). Unfortunately this requires O(n^2) memory, so if you use more than 100,000 vectors, you will need a very large amount of RAM.
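To illustrate reusing one kernel matrix for several C values, here is a small sketch. The toy data, the built-in rbfdot kernel standing in for kp, and the grid Cs are all arbitrary choices for the example. The helper kernel_matrix_gb is hypothetical and only does the memory arithmetic for a dense matrix of 8-byte doubles.

```r
library(kernlab)

set.seed(7)
# Hypothetical toy data: one variable, two classes
xtrain <- as.matrix(c(rnorm(30, mean = 0), rnorm(30, mean = 3)))
ytrain <- factor(rep(c(0, 1), each = 30))

# Compute the kernel matrix once (built-in RBF used here as a stand-in)
K <- kernelMatrix(rbfdot(sigma = 0.5), xtrain)

# Reuse it for every candidate C; the grid is arbitrary
Cs <- c(0.1, 1, 10, 100)
train_err <- sapply(Cs, function(C) {
  m <- ksvm(K, ytrain, type = "C-svc", kernel = "matrix", C = C)
  error(m)   # training error of the fitted model
})
data.frame(C = Cs, train_err = train_err)

# Rough memory for a dense n x n matrix of doubles (8 bytes each), in GB
kernel_matrix_gb <- function(n) n^2 * 8 / 1e9
kernel_matrix_gb(100000)   # 80 GB for 100,000 vectors
```

Training error is shown only to keep the sketch short. For actually choosing C you would want a held-out set or cross-validation.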

Upvotes: 3
