KonK3

Reputation: 81

Regression error in 2 class classification

I am trying to do logistic regression with g1 as class 0 and g2 as class 1:

ft=np.vstack((g1,g2)) #data stacked on each other
class = np.hstack((np.zeros(f), np.ones(f))) #class values in an array
clc=np.reshape(cl,(2*f,1)) #class values in a column matrix
w=np.zeros((2,1))   #weights matrix

for n in range(0,5000):
    s = np.dot(ft, w)

    prediction = (1 / (1 + np.exp(-s))) #sigmoid function

    gr = (np.dot(ft.T, class - prediction)) #gradient of loss function
    w += 0.01 * gr
print (w)

I evaluate my result using sklearn:

from sklearn.linear_model import LogisticRegression

And I get :

w=[[6.77812323] [2.91052504]]

coef_=[[1.22724506 1.10456893]]

Do you know why the weights do not match? Is there anything wrong with my math?

Upvotes: 1

Views: 63

Answers (1)

Scratch'N'Purr

Reputation: 10429

You're just missing the step where you average the gradient over the number of samples. Also note that instead of class (which is a reserved keyword in Python), I'm using clc:

N = len(ft)    
for _ in range(5000):
    s = np.dot(ft, w)
    prediction = 1 / (1 + np.exp(-s))  # sigmoid function

    gr = np.dot(ft.T, clc - prediction)  # gradient of loss function
    gr /= N  # calculate gradient average

    w += 0.01 * gr  # update weights
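Dividing the gradient by N is equivalent to shrinking the learning rate by a factor of N, so without the averaging step your effective step size was N times larger than intended. A minimal sketch (with made-up toy data, not the data from the question) showing the two updates produce the same weights:

```python
import numpy as np

np.random.seed(0)
X = np.random.randn(50, 2)                              # toy features
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)  # toy labels
N = len(X)

w_avg = np.zeros((2, 1))  # trained with averaged gradient, step 0.01
w_lr = np.zeros((2, 1))   # trained with raw gradient, step 0.01 / N
for _ in range(1000):
    g = np.dot(X.T, y - 1 / (1 + np.exp(-np.dot(X, w_avg)))) / N
    w_avg += 0.01 * g

    g2 = np.dot(X.T, y - 1 / (1 + np.exp(-np.dot(X, w_lr))))
    w_lr += (0.01 / N) * g2

print(np.allclose(w_avg, w_lr))  # the two update rules are mathematically identical
```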

Here's the entire code:

import numpy as np
from sklearn.linear_model import LogisticRegression

np.random.seed(1)  # for reproducibility

f = 100
mean1 = [-5, -3]
cov1 = [[5, 0], [0, 3]]
mean2 = [4, 3]
cov2 = [[3, 0], [0, 2]]
g1 = np.random.multivariate_normal(mean1, cov1, f)
g2 = np.random.multivariate_normal(mean2, cov2, f)

ft = np.vstack((g1, g2))  # data stacked on each other
cls = np.hstack((np.zeros(f), np.ones(f)))  # class values in an array
clc = np.reshape(cls, (2 * f, 1))  # class values in a column matrix
w = np.zeros((2, 1))  # weights matrix

N = len(ft)
for _ in range(5000):
    s = np.dot(ft, w)
    prediction = 1 / (1 + np.exp(-s))  # sigmoid function

    gr = np.dot(ft.T, clc - prediction)  # gradient of loss function
    gr /= N  # calculate gradient average

    w += 0.01 * gr  # update weights

print("w = {}".format(w))

lr = LogisticRegression(fit_intercept=False)
lr.fit(ft, cls)
print("lr.coef_ = {}".format(lr.coef_))

Output

w = [[1.28459432]
 [1.07186532]]
lr.coef_ = [[1.23311932 1.0363586 ]]
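The remaining small gap between w and lr.coef_ is expected: sklearn's LogisticRegression applies L2 regularization by default (C=1.0) and uses a different solver, so it won't reproduce unregularized gradient descent exactly. A quick sanity check that both weight vectors give essentially the same decision boundary, reusing the data-generation code above (the hard-coded w is the gradient-descent result printed above):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

np.random.seed(1)  # same seed as above
f = 100
g1 = np.random.multivariate_normal([-5, -3], [[5, 0], [0, 3]], f)
g2 = np.random.multivariate_normal([4, 3], [[3, 0], [0, 2]], f)
ft = np.vstack((g1, g2))
cls = np.hstack((np.zeros(f), np.ones(f)))

w = np.array([[1.28459432], [1.07186532]])  # gradient-descent weights from above
pred_gd = (1 / (1 + np.exp(-np.dot(ft, w))) > 0.5).ravel()  # labels from our model

lr = LogisticRegression(fit_intercept=False).fit(ft, cls)
pred_sk = lr.predict(ft)  # labels from sklearn

print(np.mean(pred_gd == pred_sk))  # fraction of samples where the two models agree
```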

Upvotes: 2
