Reputation: 143
How to calculate the loss of L1 and L2 regularization in Python, where w is a vector of weights of the linear model?
The regularizers should compute the loss without considering the bias term in the weights.
def l1_reg(w):
    # TO-DO: Add your code here
    return None

def l2_reg(w):
    # TO-DO: Add your code here
    return None
Upvotes: 0
Views: 14234
Reputation: 1
import numpy as np

def calculateL1(vector):
    # L1 term: sum of the absolute values of the weights
    return np.sum(np.abs(vector))

def calculateL2(vector):
    # L2 term: sum of squared weights, i.e. the dot product of the vector with itself
    return np.dot(vector, vector)
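Note that the question asks for the penalty without the bias term. A minimal sketch that does this, assuming the bias is stored as the first element w[0] (the question doesn't say where the bias sits, so that placement is an assumption):

import numpy as np

def l1_reg(w):
    # Sum of absolute values, skipping the (assumed) bias at w[0]
    return np.sum(np.abs(w[1:]))

def l2_reg(w):
    # Sum of squared weights, skipping the (assumed) bias at w[0]
    return np.dot(w[1:], w[1:])

w = np.array([0.5, 1.0, -2.0, 3.0])  # 0.5 plays the role of the bias
print(l1_reg(w))  # 1 + 2 + 3 = 6.0
print(l2_reg(w))  # 1 + 4 + 9 = 14.0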
Upvotes: 0
Reputation: 1151
While training your model you would like to get as high an accuracy as possible, so you might choose all correlated features [columns, predictors, vectors]. But if your dataset is not big enough (i.e. the number of features n is much larger than the number of examples m), this causes what's called overfitting. Overfitting means your model performs very well on the training set but fails on the test set (i.e. training accuracy is much better than test accuracy). You can think of it this way: you can solve a problem you have solved before, but you can't solve a similar problem, because you overthink it [not the same problem, but similar]. Here regularization comes in to solve this problem.
Let's first explain the logic behind regularization.
Regularization is the process of adding information [you can think of it this way: before giving you another problem, I add more information to the first one so you can categorize it, and then you don't overthink when you meet a similar problem].
[Image: an overfitted model vs. an accurate model.]
L1 & L2 are the types of information added to your model equation
With L1, you add to the model's cost function the sum of the absolute values of the parameter vector θ, multiplied by the regularization parameter λ (which can be any large number) divided by the size of the data m, where n is the number of features:

L1 term = (λ / m) * (|θ₁| + |θ₂| + ... + |θₙ|)

With L2, you add the sum of the squared values of θ, again multiplied by λ over the size of the data m:

L2 term = (λ / m) * (θ₁² + θ₂² + ... + θₙ²)

Note that both sums start at θ₁: the bias term θ₀ is conventionally not penalized.
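A minimal NumPy sketch of those two penalty terms (the names lam and m, and the placement of the bias at theta[0], are my assumptions, not from the answer):

import numpy as np

def l1_penalty(theta, lam, m):
    # (lambda / m) * sum of |theta_j| for j = 1..n; theta[0] (bias) is skipped
    return (lam / m) * np.sum(np.abs(theta[1:]))

def l2_penalty(theta, lam, m):
    # (lambda / m) * sum of theta_j^2 for j = 1..n; theta[0] (bias) is skipped
    return (lam / m) * np.sum(theta[1:] ** 2)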
In the closed-form (normal equation) solution of linear regression, the L2 penalty appears as an (n+1)×(n+1) diagonal matrix L with a zero in the upper-left entry (so the bias θ₀ is not penalized) and ones down the other diagonal entries, multiplied by the regularization parameter λ:

θ = (XᵀX + λL)⁻¹ Xᵀy
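A sketch of that matrix in NumPy (X, y, and lam are illustrative names; X is assumed to already include a leading column of ones for the bias):

import numpy as np

def regularized_normal_equation(X, y, lam):
    # L is the (n+1) x (n+1) identity with its upper-left entry zeroed,
    # so the bias parameter theta_0 receives no penalty
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0
    # Solve (X^T X + lambda * L) theta = X^T y rather than inverting explicitly
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)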
Upvotes: 2
Reputation: 119
I think it is important to clarify this before answering: the L1 and L2 regularization terms aren't loss functions by themselves. They are penalty terms added to the loss that keep the weights in the vector from becoming too large, which helps reduce overfitting.
The L1 regularization term is the sum of the absolute values of each element. For a length N vector, it would be |w[1]| + |w[2]| + ... + |w[N]|.
The L2 regularization term is the sum of the squared values of each element. For a length N vector, it would be w[1]² + w[2]² + ... + w[N]².
I hope this helps!
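A quick numeric check of those two sums (the vector is made up for illustration):

import numpy as np

w = np.array([1.0, -2.0, 2.0])
print(np.sum(np.abs(w)))  # L1 term: 1 + 2 + 2 = 5.0
print(np.sum(w ** 2))     # L2 term: 1 + 4 + 4 = 9.0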
Upvotes: 0