Davide Nava

Reputation: 157

Support Vector Machine Geometrical Intuition

[figure: SVM hyperplane diagram]

Hi, I am having a lot of difficulty understanding why the equation of the support vector machine hyperplane has a 1 after the >=: w.x + b >= 1 (why this 1?). I know it could be something about the intersection point on the y axis, but I cannot relate that to the support vectors and to their meaning for classification. Can anyone please explain why the equation has that 1 (or -1)?

Thank you.

Upvotes: 1

Views: 1180

Answers (2)

lejlot

Reputation: 66825

The 1 is just an algebraic simplification that comes in handy later, in the optimization.

First, notice that all three hyperplanes can be denoted as

w'x+b= 0
w'x+b=+A
w'x+b=-A

If we fixed the norm of the normal w, ||w||=1, then the above would have one solution with some arbitrary A depending on the data; let's call our solution v and c (the values of the optimal w and b respectively). But if we let w have any norm, then we can easily see that if we put

w'x+b= 0
w'x+b=+1
w'x+b=-1

then there is a unique pair w, b which satisfies these equations, and it is given by w=v/A, b=c/A, because

(v/A)'x+(c/A)= 0 (when v'x+c=0) // for the middle hyperplane
(v/A)'x+(c/A)=+1 (when v'x+c=+A) // for the positive hyperplane
(v/A)'x+(c/A)=-1 (when v'x+c=-A) // for the negative hyperplane

In other words, we assume that these "support vectors" satisfy the w'x+b=+/-1 equation for future simplification, and we can do it because for any solution satisfying v'x+c=+/-A there is a solution to our equations (with a different norm of w).
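To make the rescaling concrete, here is a minimal numeric sketch in Python (the values of v, c and A below are made up for illustration, not taken from any particular dataset):

import numpy as np

# Hypothetical pre-rescaling solution with ||v|| = 1 (made-up numbers).
v = np.array([0.6, 0.8])          # unit normal, ||v|| = 1
c = -1.0                          # offset
A = 2.0                           # data-dependent margin value

# A point on the positive margin plane v'x + c = +A:
x_pos = np.array([3.0, 1.5])
print(v @ x_pos + c)              # 2.0, i.e. +A

# Rescale as in the answer: w = v/A, b = c/A.
w = v / A
b = c / A

# The same point now satisfies w'x + b = +1, and the middle
# hyperplane w'x + b = 0 is geometrically unchanged.
print(w @ x_pos + b)              # 1.0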

So once we have this simplification, our optimization problem reduces to the minimization of the norm ||w|| (maximization of the size of the margin, which can now be expressed as 2/||w||). If we stayed with the "normal" equation with a (not fixed!) A value, then the maximization of the margin would have one more "dimension": we would have to search through w, b, A to find the triple which maximizes it (as the "restrictions" would be of the form y(w'x+b)>A). Now we just search through w and b (and in the dual formulation, just through alpha, but that is a whole new story).
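You can check this convention empirically with a library SVM; a minimal sketch, assuming scikit-learn is installed (the toy points below are made up):

import numpy as np
from sklearn.svm import SVC

# Tiny linearly separable toy dataset (made-up points).
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 3.0], [4.0, 3.0]])
y = np.array([-1, -1, 1, 1])

# A large C approximates the hard-margin SVM discussed above.
clf = SVC(kernel='linear', C=1e6).fit(X, y)
w = clf.coef_[0]
b = clf.intercept_[0]

# The support vectors land on the planes w'x + b = +/-1 ...
print(X[clf.support_] @ w + b)    # values close to -1 and +1
# ... and the margin width is 2/||w||.
print(2 / np.linalg.norm(w))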

This step is not required. You can build an SVM without it, but it makes things simpler: the Occam's razor rule.

Upvotes: 1

venergiac

Reputation: 7717

[figure: non-optimal hyperplane]

This boundary is called the "margin" and must be maximized, which means you have to minimize ||w||. The aim of SVM is to find a hyperplane that maximizes the distance between the two groups.

However, there are infinitely many solutions (see the figure: move the optimal hyperplane along the perpendicular vector), and we need to fix at least the boundaries: the +1 or -1 is a common convention used to rule out these infinitely many solutions.

Formally, you have to optimize the distance r of the closest points to the hyperplane, where r ||w|| = |w'x + b|; setting the boundary condition r ||w|| = 1 is exactly that +1 / -1 convention.
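As a quick numeric illustration of this condition (a sketch with made-up numbers, assuming the +/-1 convention already holds for the closest point):

import numpy as np

# Made-up (w, b) under the +/-1 convention.
w = np.array([0.3, 0.4])            # ||w|| = 0.5
b = -0.5
x_closest = np.array([3.0, 1.5])    # lies on w'x + b = +1

# Distance r from the closest point to the hyperplane w'x + b = 0,
# so that r * ||w|| = |w'x + b| = 1.
r = abs(w @ x_closest + b) / np.linalg.norm(w)
print(r, 1 / np.linalg.norm(w))     # both 2.0: r = 1/||w||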

Upvotes: 1
