Vandermonde

Reputation: 117

SVM library suitable for online embedding

We're working on a machine learning project in which we'd like to see the influence of certain online sample embedding methods on SVMs.

In the process we've tried interfacing with Pegasos and dlib as well as designing (and attempting to write) our own SVM implementation.

dlib seems promising as it allows interfacing with user-written kernels. Yet kernels alone don't give us the desired "online" behavior (unless that assumption is wrong).

Therefore, if you know of an SVM library that supports online embedding and custom-written embedders, it would be of great help.


Just to be clear about "online":

It is crucial that the embedding happens online, one sample at a time, in order to avoid heavy memory usage.

We basically want to do the following within stochastic sub-gradient descent (in very general pseudocode):

w = 0 vector
for t = 1:T
  i = random integer from [1, n]

  embed(sample_xi)   // embed one sample at a time

  // the embedded sample_xi is passed to the sub-gradient of loss_i
  w = w - (alpha/t) * sub_gradient(loss_i)
end
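A minimal Python sketch of that loop, for concreteness: Pegasos-style hinge-loss updates where a hypothetical `embed` callback is applied to one raw sample per iteration, so the embedded data set is never materialized in memory. All names here are illustrative, not from any particular library.

```python
import random

def train_online_svm(samples, labels, embed, lam=0.01, T=1000, seed=0):
    """Stochastic sub-gradient descent (Pegasos-style) for a linear SVM
    on embedded samples. `embed` maps one raw sample to a feature vector
    on the fly, so only one embedded sample exists at a time."""
    rng = random.Random(seed)
    dim = len(embed(samples[0]))
    w = [0.0] * dim
    for t in range(1, T + 1):
        i = rng.randrange(len(samples))
        x = embed(samples[i])          # online embedding of one sample
        y = labels[i]
        eta = 1.0 / (lam * t)          # step size, i.e. alpha/t
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        # sub-gradient of hinge loss plus L2 regularizer
        if margin < 1:
            w = [(1 - eta * lam) * wj + eta * y * xj for wj, xj in zip(w, x)]
        else:
            w = [(1 - eta * lam) * wj for wj in w]
    return w
```

With `embed=lambda x: x` this reduces to plain linear Pegasos; any user-written embedder with the same one-sample-in, one-vector-out signature slots in unchanged.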

Upvotes: 1

Views: 244

Answers (2)

Davis King

Reputation: 4791

Maybe you want to use something like dlib's empirical kernel map. You can read its documentation, and particularly the example program, for the gory details of what it does, but basically it lets you project a sample into the span of some basis set in a kernel feature space. There are even algorithms in dlib that iteratively build the basis set, which is maybe what you are asking about.
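For intuition, here is a small Python sketch of the idea behind an empirical kernel map (a Nyström-style projection onto the span of a basis set in kernel feature space). This is not dlib's C++ API; `EmpiricalKernelMap` and `rbf` are names invented for the sketch.

```python
import numpy as np

def rbf(a, b, gamma=0.5):
    """Gaussian (RBF) kernel."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-gamma * d.dot(d)))

class EmpiricalKernelMap:
    """Projects samples into the span of a fixed basis set in kernel
    feature space, yielding ordinary finite-dimensional vectors whose
    dot products approximate kernel values."""
    def __init__(self, kernel, basis):
        self.kernel = kernel
        self.basis = basis
        K = np.array([[kernel(bi, bj) for bj in basis] for bi in basis])
        # K^{-1/2} via eigendecomposition (K is symmetric PSD)
        vals, vecs = np.linalg.eigh(K)
        vals = np.clip(vals, 1e-12, None)
        self.whiten = vecs @ np.diag(vals ** -0.5) @ vecs.T

    def project(self, x):
        # v(x) = K^{-1/2} k_B(x), so v(x)·v(y) = k_B(x)^T K^{-1} k_B(y)
        kx = np.array([self.kernel(b, x) for b in self.basis])
        return self.whiten @ kx
```

After projection, dot products of projected samples reproduce the kernel on the span of the basis, so a linear method applied to the projected vectors behaves like a kernel method restricted to that span.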

Upvotes: 0

phoxis

Reputation: 61910

I think in your case you might want to consider Budgeted Stochastic Gradient Descent for Large-Scale SVM Training (BSGD) [1] by Wang, Crammer, and Vucetic.

This is because, as the paper explains under the "curse of kernelization", the number of support vectors can grow without bound in a plain kernelized scheme like the one indicated in the pseudocode in your question, so you might want to explore this option instead.

The Shark Machine Learning Library implements BSGD; check its quick tutorial.
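To illustrate the budgeted idea, a rough Python sketch of kernelized Pegasos with a hard budget on the number of support vectors. This is not Shark's API; the names are invented, and removal of the smallest-coefficient support vector is the simplest maintenance strategy discussed in the BSGD paper (which also covers projection and merging).

```python
import math
import random

def rbf(a, b, gamma=0.5):
    """Gaussian (RBF) kernel."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def budgeted_kernel_sgd(samples, labels, kernel, budget=10, lam=0.01, T=500, seed=0):
    """Kernelized Pegasos-style SGD: the model is a dict mapping sample
    index -> coefficient. Whenever the support-vector count would exceed
    the budget, the entry with the smallest |coefficient| is dropped."""
    rng = random.Random(seed)
    coef = {}                          # sample index -> coefficient
    for t in range(1, T + 1):
        i = rng.randrange(len(samples))
        x, y = samples[i], labels[i]
        f = sum(c * kernel(samples[j], x) for j, c in coef.items())
        # regularizer shrink step: w <- (1 - eta*lam) w, eta = 1/(lam*t)
        coef = {j: (1 - 1.0 / t) * c for j, c in coef.items()}
        if y * f < 1:                  # hinge sub-gradient is active
            coef[i] = coef.get(i, 0.0) + y / (lam * t)
            if len(coef) > budget:     # budget maintenance by removal
                coef.pop(min(coef, key=lambda j: abs(coef[j])))
    return coef

def predict(coef, samples, kernel, x):
    return 1 if sum(c * kernel(samples[j], x) for j, c in coef.items()) > 0 else -1
```

The memory footprint is bounded by the budget regardless of how many samples are streamed through, which is exactly what the unbudgeted pseudocode in the question does not guarantee for a kernel machine.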

Upvotes: 2
