Reputation: 157
Hi, I'm having real difficulty understanding why, in the hyperplane equation of a support vector machine, there is a 1 after the >=: w.x + b >= 1 (why this 1?). I know it could be something about the intersection point on the y axis, but I cannot relate that to the support vectors and their meaning for classification. Can anyone please explain why the equation has that 1 (or -1)?
Thank you.
Upvotes: 1
Views: 1180
Reputation: 66825
The 1 is just an algebraic simplification, which comes in handy in the later optimization.
First, notice that all three hyperplanes can be denoted as

w'x + b = 0
w'x + b = +A
w'x + b = -A
If we fixed the norm of the normal w, ||w|| = 1, then the above would have one solution with some arbitrary A depending on the data; let's call that solution v and c (the values of the optimal w and b respectively). But if we let w have any norm, then we can easily see that if we put
w'x + b = 0
w'x + b = +1
w'x + b = -1
then there is one unique w which satisfies these equations, and it is given by w = v/A, b = c/A, because

(v/A)'x + (c/A) = 0  (when v'x + c = 0)  // for the middle hyperplane
(v/A)'x + (c/A) = +1 (when v'x + c = +A) // for the positive hyperplane
(v/A)'x + (c/A) = -1 (when v'x + c = -A) // for the negative hyperplane
In other words, we assume that these "supporting vectors" satisfy the equation w'x + b = ±1 for future simplification, and we can do it because for any solution satisfying v'x + c = ±A there is a solution of our equation (with a different norm of w). See the sketch below.
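Here is a minimal NumPy sketch of this rescaling; the vector v, offset c and margin value A below are made-up numbers, just for illustration:

```python
import numpy as np

# Hypothetical "optimal" solution with arbitrary norm, and its margin value A.
v = np.array([2.0, -1.0])
c = 0.5
A = 4.0

# Rescale exactly as in the argument above: w = v/A, b = c/A.
w, b = v / A, c / A

# Construct one point on each of the three hyperplanes defined by (v, c).
x_mid = -c * v / v.dot(v)        # satisfies v'x + c = 0
x_pos = (A - c) * v / v.dot(v)   # satisfies v'x + c = +A
x_neg = (-A - c) * v / v.dot(v)  # satisfies v'x + c = -A

# After rescaling, the same points lie on w'x + b = 0, +1 and -1.
print(w.dot(x_mid) + b)  # ~  0.0
print(w.dot(x_pos) + b)  # ~ +1.0
print(w.dot(x_neg) + b)  # ~ -1.0
```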
So once we have this simplification, our optimization problem reduces to the minimization of the norm ||w|| (i.e. maximization of the size of the margin, which can now be expressed as 2/||w||). If we stayed with the general equation with a (not fixed!) A value, then the maximization of the margin would have one more "dimension": we would have to search through w, b and A to find the triple which maximizes it (as the "restrictions" would be of the form y(w'x + b) >= A). Now we just search through w and b (and in the dual formulation, just through alpha, but that is a whole new story).
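For a concrete check, here is a short sketch using scikit-learn on a made-up, linearly separable toy set (the large C is meant to approximate the hard-margin SVM); after fitting, the support vectors satisfy y(w'x + b) ≈ 1, and the margin equals 2/||w||:

```python
import numpy as np
from sklearn.svm import SVC

# Toy data: two classes along the line y = x, separable between (1,1) and (3,3).
X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
y = np.array([-1, -1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)
w = clf.coef_[0]
b = clf.intercept_[0]

# Each support vector satisfies y * (w'x + b) = 1 after the rescaling.
for sv, label in zip(clf.support_vectors_, y[clf.support_]):
    print(label * (sv.dot(w) + b))  # ~ 1.0

# The distance between the two supporting hyperplanes is 2/||w||.
print(2 / np.linalg.norm(w))  # ~ 2.83, the distance between (1,1) and (3,3)
```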
This step is not required. You can build an SVM without it, but it makes things simpler - the Ockham's razor rule.
Upvotes: 1
Reputation: 7717
This boundary is called the "margin", and it must be maximized, which means you have to minimize ||w||. The aim of an SVM is to find the hyperplane that maximizes the distance between the two groups.
However, there are infinitely many solutions (see figure: move the optimal hyperplane along the perpendicular vector), so we need to fix at least the boundaries: the +1 or -1 is a common convention used to rule out these infinitely many solutions.
Formally, you have to optimize r·||w||, where r is the distance from a support vector to the hyperplane, and we set the boundary condition r·||w|| = 1.
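As a small numeric illustration (the values of w and b below are made up): a point on w'x + b = 1 lies exactly at distance 1/||w|| from the hyperplane w'x + b = 0, so the full margin is 2/||w||:

```python
import numpy as np

w = np.array([0.5, 0.5])
b = -2.0

x = (1 - b) * w / w.dot(w)  # one point satisfying w'x + b = 1

# Point-to-hyperplane distance: |w'x + b| / ||w||.
dist = abs(w.dot(x) + b) / np.linalg.norm(w)

print(dist, 1 / np.linalg.norm(w))  # both ~ 1.414, i.e. r = 1/||w||
```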
Upvotes: 1