Reputation: 14399
From what I understood from this article, the blue circles are the level curves and the blue dot is the optimal solution that minimizes the cost function. The yellow circle is the L2-norm constraint.
The solution we need is the one that makes the cost function as small as possible while still lying within the circle. In other words, the solution is the point where the yellow circle is tangent to a level curve.
But my question is: how can this be the solution if the W values at the tangent point don't fully minimize the cost function? Only the blue dot minimizes the cost function.
Upvotes: 0
Views: 130
Reputation: 185
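The blue dot minimizes the cost function only if there are no constraints. If the minimization is constrained by the L2 norm, the blue dot cannot be the solution, because it lies outside the yellow circle and therefore violates the constraint. The solution is instead the point w*: of all the points inside the circle, it is the one with the lowest cost, and it sits where a level curve just touches the circle.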
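To make the picture concrete, here is a minimal sketch using a made-up quadratic cost. The matrix `A`, the point `w_hat`, the radius, and the helper names are all assumptions chosen for illustration, not values from the article. It finds the constrained minimum with projected gradient descent, which is just one convenient way to do it.

```python
import numpy as np

# Hypothetical quadratic cost J(w) = 0.5 * (w - w_hat)^T A (w - w_hat).
# A is positive definite, so the level curves are ellipses around w_hat.
A = np.array([[3.0, 0.5],
              [0.5, 1.0]])
w_hat = np.array([2.0, 1.5])   # unconstrained minimizer (the "blue dot")
radius = 1.0                   # radius of the L2 ball (the "yellow circle")


def cost(w):
    d = w - w_hat
    return 0.5 * d @ A @ d


def grad(w):
    return A @ (w - w_hat)


def project_onto_ball(w, r):
    # Euclidean projection onto {w : ||w||_2 <= r}.
    norm = np.linalg.norm(w)
    return w if norm <= r else w * (r / norm)


def constrained_minimum(steps=2000, lr=0.1):
    # Projected gradient descent: take a gradient step, then project back
    # into the feasible region.
    w = np.zeros(2)
    for _ in range(steps):
        w = project_onto_ball(w - lr * grad(w), radius)
    return w


w_star = constrained_minimum()
print("norm of blue dot:", np.linalg.norm(w_hat), "cost:", cost(w_hat))
print("norm of w*:      ", np.linalg.norm(w_star), "cost:", cost(w_star))
```

In this setup the blue dot has a norm larger than the radius, so it is not feasible, and w* ends up on the boundary of the circle with a higher cost than the blue dot. At w* the gradient of the cost points in the opposite direction to w* itself, which is the tangency condition from the figure.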
The reason for using the L2 constraint is that we are trying to minimize the error on the test data, not on the training data (i.e. we are not directly interested in minimizing the loss function on the training set). Simpler solutions (those with a smaller L2 norm) tend to overfit less, so we expect the gap between test and training error to be smaller, which is desirable.
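The trade-off can be seen in a small sketch. It assumes the penalized form of the same idea (ridge regression, where a penalty weight `lam` plays the role of the constraint radius) and uses synthetic data; the data, `ridge_fit`, and the `lam` values are all illustrative assumptions. It only shows the two effects that are guaranteed: a stronger penalty gives a smaller weight norm and a training error that does not decrease.

```python
import numpy as np

# Synthetic regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))
true_w = rng.normal(size=10)
y = X @ true_w + rng.normal(scale=0.5, size=30)


def ridge_fit(X, y, lam):
    # Closed-form minimizer of ||X w - y||^2 + lam * ||w||^2.
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


lams = [0.0, 0.1, 1.0, 10.0, 100.0]
norms = []
train_mse = []
for lam in lams:
    w = ridge_fit(X, y, lam)
    norms.append(np.linalg.norm(w))
    train_mse.append(np.mean((X @ w - y) ** 2))

for lam, n, e in zip(lams, norms, train_mse):
    print(f"lam={lam:7.1f}  ||w||={n:.4f}  train MSE={e:.4f}")
```

Whether the test error actually improves depends on the data and on how `lam` (or the radius) is chosen, which is why it is usually picked with a validation set.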
Upvotes: 1