Reputation: 73
My dataset has feature columns and a target label that is either 0 or 1.
When I use an SVM classifier for binary classification, it predicts well.
But my question is: how is the prediction made mathematically?
The marginal hyperplanes H1 and H2 have the equations w^T x + b = +1 and w^T x + b = -1,
meaning that if w^T x + b >= +1 a point falls in one class, and if w^T x + b <= -1 it falls in the other class.
But we have given the target labels as 0 and 1.
How is this actually done mathematically?
Could an expert please explain?
Upvotes: 1
Views: 785
Reputation: 393
Basically, SVM wants to find the optimal hyperplane that splits the datapoints in such a way that the margin between the closest datapoints of each class (the so-called support vectors) is maximized. This all breaks down to the following Lagrangian optimization problem:
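(The original formula isn't shown here, so this is a standard reconstruction of the hard-margin primal Lagrangian that the symbols below belong to; it is presumably the Formula (8) referenced further down.)

```latex
L(w, b, \lambda) = \frac{1}{2}\lVert w \rVert^{2}
  - \sum_{i=1}^{N} \lambda_i \left[ y_i \, (w^{\top} x_i + b) - 1 \right],
  \qquad \lambda_i \ge 0 \tag{8}
```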
w: the normal vector that determines the optimal hyperplane (for intuition, make yourself familiar with the geometric meaning of a dot product)
(w^T x_i + b) is a scalar; divided by ||w||, it gives the signed geometric distance between a single datapoint x_i and the maximum-margin hyperplane
b is a bias term, a scalar offset that shifts the hyperplane away from the origin; more on that can be found here: Stanford University - Computer Science Lecture 3 - SVM
λ_i: the Lagrange multipliers (one per datapoint)
y_i: the class label of datapoint x_i, encoded as -1 or +1
Solving the optimization problem yields all the necessary parameters: w, b, and the λ_i.
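To make this concrete, here is a minimal sketch (assuming scikit-learn and a linear kernel, neither of which is stated in the question) showing that the fitted w and b reproduce the predictions via the sign of w^T x + b:

```python
import numpy as np
from sklearn.svm import SVC

# Toy data with 0/1 labels as in the question (values are made up)
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel="linear")
clf.fit(X, y)

w = clf.coef_[0]            # the weight vector w
b = clf.intercept_[0]       # the bias b
scores = X @ w + b          # w^T x_i + b for every datapoint

# Negative side of the hyperplane -> class 0, positive side -> class 1
pred = (scores >= 0).astype(int)
print(pred)                 # matches clf.predict(X)
```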
To answer your question in one sentence: the class boundaries [-1, +1] are set arbitrarily; it is really just a convention. Scaling (w, b) by any positive constant leaves the separating hyperplane unchanged, so one may normalize them so that the closest points satisfy |w^T x + b| = 1.
The labels of your binary data [0; 1] (so-called dummy variables) have nothing to do with those boundaries; they are just a convenient way to label binary data, needed only to link the features to their corresponding class or category. Internally, the labels are simply re-encoded as y_i = -1 and y_i = +1 before the optimization, as sketched below.
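As an illustration of that re-encoding (a sketch of what libraries effectively do internally, not any specific library's code):

```python
import numpy as np

t = np.array([0, 0, 1, 1])   # the 0/1 labels you provide
y = 2 * t - 1                # -> [-1, -1, +1, +1], used in the optimization
# The single constraint y_i * (w^T x_i + b) >= 1 then covers both classes.
```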
The only non-parameter in Formula (8) is x_i, your datapoint in feature space.
At least that's how I understand SVM. Feel free to correct me if I am wrong or imprecise.
Upvotes: 1