Davide Nava

Reputation: 157

Support Vector Machine Geometrical Intuition

[figure: SVM hyperplane diagram]

Hi, I am having a lot of difficulty understanding why the equation of the support vector machine hyperplane has a 1 after the >=: w.x + b >= 1 (why this 1?). I know it could be something about the intersection point on the y axis, but I cannot relate that to the support vectors and to their meaning for classification. Can anyone please explain why the equation has that 1 (or -1)?

Thank you.

Upvotes: 1

Views: 1180

Answers (2)

lejlot

Reputation: 66825

The 1 is just an algebraic simplification that comes in handy later, in the optimization.

First, notice that all three hyperplanes can be denoted as

w'x+b= 0
w'x+b=+A
w'x+b=-A

If we fixed the norm of the normal w, ||w||=1, then the above would have one solution with some arbitrary A depending on the data; let's call our solution v and c (the values of the optimal w and b respectively). But if we let w have any norm, then we can easily see that if we put

w'x+b= 0
w'x+b=+1
w'x+b=-1

then there is a unique pair w, b which satisfies these equations, and it is given by w=v/A, b=c/A, because

(v/A)'x+(c/A)= 0 (when v'x+c=0) // for the middle hyperplane
(v/A)'x+(c/A)=+1 (when v'x+c=+A) // for the positive hyperplane
(v/A)'x+(c/A)=-1 (when v'x+c=-A) // for the negative hyperplane

In other words, we assume that these "support vectors" satisfy the w'x+b=+/-1 equation for future simplification, and we can do it because for any solution satisfying v'x+c=+/-A there is a solution to our equations (with a different norm of w).
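To make the rescaling concrete, here is a minimal numeric sketch in Python (the values of v, c and A below are made up for illustration, not taken from any particular dataset):

import numpy as np

# Hypothetical pre-rescaling solution with ||v|| = 1 (made-up numbers).
v = np.array([0.6, 0.8])          # unit normal, ||v|| = 1
c = -1.0                          # offset
A = 2.0                           # data-dependent margin value

# A point on the positive margin plane v'x + c = +A:
x_pos = np.array([3.0, 1.5])
print(v @ x_pos + c)              # 2.0, i.e. +A

# Rescale as in the answer: w = v/A, b = c/A.
w = v / A
b = c / A

# The same point now satisfies w'x + b = +1, and the middle
# hyperplane w'x + b = 0 is geometrically unchanged.
print(w @ x_pos + b)              # 1.0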

So once we have this simplification, our optimization problem reduces to the minimization of the norm ||w|| (maximization of the size of the margin, which can now be expressed as 2/||w||). If we stayed with the "normal" equation with a (not fixed!) A value, then the maximization of the margin would have one more "dimension": we would have to search through w, b, A to find the triple which maximizes it (as the "restrictions" would be of the form y(w'x+b)>A). Now we just search through w and b (and in the dual formulation, just through alpha, but that is a whole new story).
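You can check this convention empirically with a library SVM; a minimal sketch, assuming scikit-learn is installed (the toy points below are made up):

import numpy as np
from sklearn.svm import SVC

# Tiny linearly separable toy dataset (made-up points).
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 3.0], [4.0, 3.0]])
y = np.array([-1, -1, 1, 1])

# A large C approximates the hard-margin SVM discussed above.
clf = SVC(kernel='linear', C=1e6).fit(X, y)
w = clf.coef_[0]
b = clf.intercept_[0]

# The support vectors land on the planes w'x + b = +/-1 ...
print(X[clf.support_] @ w + b)    # values close to -1 and +1
# ... and the margin width is 2/||w||.
print(2 / np.linalg.norm(w))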

This step is not required. You can build an SVM without it, but it makes things simpler: the Occam's razor rule.

Upvotes: 1

venergiac

Reputation: 7717

[figure: non-optimal hyperplane]

This boundary is called the "margin" and must be maximized, which means you have to minimize ||w||. The aim of SVM is to find a hyperplane that maximizes the distance between the two groups.

However, there are infinitely many solutions (see the figure: move the optimal hyperplane along the perpendicular vector), and we need to fix at least the boundaries: the +1 or -1 is a common convention used to rule out these infinitely many solutions.

Formally, you have to optimize the distance r of the closest points to the hyperplane, where r ||w|| = |w'x + b|; setting the boundary condition r ||w|| = 1 is exactly that +1 / -1 convention.
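As a quick numeric illustration of this condition (a sketch with made-up numbers, assuming the +/-1 convention already holds for the closest point):

import numpy as np

# Made-up (w, b) under the +/-1 convention.
w = np.array([0.3, 0.4])            # ||w|| = 0.5
b = -0.5
x_closest = np.array([3.0, 1.5])    # lies on w'x + b = +1

# Distance r from the closest point to the hyperplane w'x + b = 0,
# so that r * ||w|| = |w'x + b| = 1.
r = abs(w @ x_closest + b) / np.linalg.norm(w)
print(r, 1 / np.linalg.norm(w))     # both 2.0: r = 1/||w||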

Upvotes: 1
