DN1

Reputation: 218

How to add L1 regularization in Python?

I am trying to code logistic regression from scratch. In the code I have, I thought my cost derivative was my regularization, but I've been tasked with adding L1-norm regularization. How do I add this in Python? Should it be added where I have defined the cost derivative? Any help in the right direction is appreciated.

import numpy as np

# (X_train, Y_train, X_test, Y_test come from my data-loading code, not shown)

def Sigmoid(z):
    return 1/(1 + np.exp(-z))

def Hypothesis(theta, X):   
    return Sigmoid(X @ theta)

def Cost_Function(X,Y,theta,m):
    hi = Hypothesis(theta, X)
    _y = Y.reshape(-1, 1)
    J = 1/float(m) * np.sum(-_y * np.log(hi) - (1-_y) * np.log(1-hi))
    return J

def Cost_Function_Derivative(X,Y,theta,m,alpha):
    hi = Hypothesis(theta,X)
    _y = Y.reshape(-1, 1)
    J = alpha/float(m) * X.T @ (hi - _y)
    return J

def Gradient_Descent(X,Y,theta,m,alpha):
    new_theta = theta - Cost_Function_Derivative(X,Y,theta,m,alpha)
    return new_theta

def Accuracy(theta):
    correct = 0
    length = len(X_test)
    prediction = (Hypothesis(theta, X_test) > 0.5) 
    _y = Y_test.reshape(-1, 1)
    correct = prediction == _y
    my_accuracy = (np.sum(correct) / length)*100
    print ('LR Accuracy: ', my_accuracy, "%")

def Logistic_Regression(X,Y,alpha,theta,num_iters):
    m = len(Y)
    for x in range(num_iters):
        new_theta = Gradient_Descent(X,Y,theta,m,alpha)
        theta = new_theta
        if x % 100 == 0:
            # print('theta: ', theta)
            # print('cost: ', Cost_Function(X,Y,theta,m))
    Accuracy(theta)
ep = .012 
initial_theta = np.random.rand(X_train.shape[1],1) * 2 * ep - ep
alpha = 0.5
iterations = 10000
Logistic_Regression(X_train,Y_train,alpha,initial_theta,iterations)

Upvotes: 4

Views: 1635

Answers (2)

JeeyCi

Reputation: 579

Either the marked answer or the code itself behaves strangely when checked:

import numpy as np
import pandas as pd
from scipy.special import expit

##e=0.2

def Sigmoid(z):
    return expit(-z)

def Hypothesis(theta, X):
    return Sigmoid(X @ theta)

def Cost_Function(X,Y,theta,m):
    hi = Hypothesis(theta, X)
    _y = Y.reshape(-1, 1)
    J = 1/m * np.sum(-_y * np.log(hi) - (1-_y) * np.log(1-hi))
##    J = J + e * np.sum(abs(theta))
    return J

def Cost_Function_Derivative(X,Y,theta,m):
    h = Hypothesis(theta,X)
    _y = Y.reshape(-1, 1)
    J = 1/m * X.T @ (h - _y)
##    J = J + alpha * e * np.where(theta >= 0, 1.0, -1.0)
    return J

def Gradient_Descent(X,Y,theta,m,alpha):
    new_theta = theta - alpha * Cost_Function_Derivative(X,Y,theta,m)
    return new_theta

def Accuracy(theta):
    correct = 0
    length = len(X_test)
    prediction = (Hypothesis(theta, X_test) > 0.5)
    _y = y_test.reshape(-1, 1)
    correct = prediction == _y
    my_accuracy = (np.sum(correct) / length)*100
    print('hand-made LR Accuracy: ', my_accuracy, "%")

def Logistic_Regression(X,Y,alpha,theta,num_iters):
    m = len(Y)
    for x in range(num_iters):
        new_theta = Gradient_Descent(X,Y,theta,m,alpha)
        # update
        theta = new_theta
        if x % 100 == 0:
            # print('theta: ', theta)
            # print('cost: ', Cost_Function(X,Y,theta,m))

    Accuracy(theta)


ep = .02

########## sklearn
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

X, y =  make_blobs(1000, n_features=2, centers=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)
# fit(X_train)
sc = StandardScaler()
sc.fit(X)
X = sc.transform(X)

from sklearn.linear_model import LogisticRegression
model_lr = LogisticRegression( C=ep, penalty="l1", tol=0.01, solver="saga", random_state=10)
model_lr.fit(X_train, y_train)
# predict(X_test)
y_pred_lr = model_lr.predict(X_test)
print("sklearn Accuracy Score: ", accuracy_score(y_pred_lr, y_test)*100)

########### hand-made

initial_theta = np.random.rand(X_train.shape[1],1) 
alpha = 0.2
iterations = 10000
Logistic_Regression(X_train,y_train,alpha,initial_theta,iterations)

# sklearn Accuracy Score:  95.45454545454545
# hand-made LR Accuracy:  50.60606060606061 %

This implementation gives more comparable results, e.g.:

Accuracy on test set by model at link   :   94.01197604790418
Accuracy on test set by sklearn model   :   95.20958083832335

P.S. see also: backpropagation algorithm

Upvotes: 0

Gerges

Reputation: 6499

Regularization adds a term to the cost function so that there is a compromise between minimizing the cost and keeping the model parameters small, which reduces overfitting. You can control how much of a compromise you want by scaling the regularization term with a scalar e.

So just add the L1 norm of theta to the original cost function:

J = J + e * np.sum(abs(theta))
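
For concreteness, here is a minimal sketch of the question's Cost_Function with that line added (passing e in as an extra argument is my own choice, not part of the original code):

def Cost_Function(X, Y, theta, m, e):
    hi = Hypothesis(theta, X)
    _y = Y.reshape(-1, 1)
    # original cross-entropy cost
    J = 1/float(m) * np.sum(-_y * np.log(hi) - (1-_y) * np.log(1-hi))
    # L1 penalty on the weights
    J = J + e * np.sum(np.abs(theta))
    return J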

Since this term is added to the cost function, it must also be taken into account when computing the gradient of the cost function.

This is simple because the derivative of a sum is the sum of the derivatives, so we only need the derivative of the term sum(abs(theta)). Since abs is piecewise linear, its derivative is constant on each piece: it is +1 where theta >= 0 and -1 where theta < 0 (strictly speaking the derivative is undefined at 0, but in practice we don't care about that point).

So in the function Cost_Function_Derivative we add:

J = J + alpha * e * np.where(theta >= 0, 1.0, -1.0)
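
Putting both changes together, a sketch of the modified Cost_Function_Derivative from the question could look like this (again with e passed in explicitly, my own convention):

def Cost_Function_Derivative(X, Y, theta, m, alpha, e):
    hi = Hypothesis(theta, X)
    _y = Y.reshape(-1, 1)
    # gradient of the original cost, scaled by the learning rate alpha
    J = alpha/float(m) * X.T @ (hi - _y)
    # subgradient of e * sum(abs(theta)): +1 where theta >= 0, -1 where theta < 0
    J = J + alpha * e * np.where(theta >= 0, 1.0, -1.0)
    return J

With these two changes, increasing e pushes more of the weights toward zero, which is the usual sparsity-inducing effect of L1 regularization.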

Upvotes: 4
