AAKASH RAJU Udasi

Reputation: 79

Sklearn linear regression loss function not matching with manual code

I have been trying to replicate the cost produced by the sklearn LinearRegression model with my own manual code. There is a huge difference between the two results and I am not able to figure out why.

SkLearn implementation:

import numpy as np
import sklearn.linear_model
from sklearn import model_selection

X_train, X_test, Y_train, Y_test = model_selection.train_test_split(X, Y, test_size=0.30)
classifier = sklearn.linear_model.LinearRegression()
classifier.fit(X_train, Y_train)
# Cost: square root of the sum of squared training residuals
cost = np.sqrt(np.sum((np.dot(X_train, classifier.coef_.reshape(9,1)) + classifier.intercept_ - Y_train.reshape(478,1))**2))
print(cost)

cost = 4.236441942240197
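(For comparison, the same quantity can also be written with classifier.predict, which applies the intercept itself; this is just an equivalent sketch of the line above, not code from the original question.)

# Equivalent computation: predict() already adds the intercept, so no manual
# dot product or reshape of the coefficients is needed.
residuals = classifier.predict(X_train).reshape(-1, 1) - Y_train.reshape(478, 1)
print(np.sqrt(np.sum(residuals**2)))  # same value, ~4.2364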

My attempt at replicating the result:

assert X_train_rev.shape == (478, 10)  # X_train plus a column of ones, see note below
Y_train = Y_train.reshape(478, 1)

alpha = 0.0005  # learning rate
coefficient = np.random.randn(1, 10)  # initialisation of coefficients, including the intercept

# Loop through iterations
for i in range(100000):
    cost = np.sqrt(np.sum((np.dot(X_train_rev, coefficient.T) - Y_train)**2))  # current cost
    if i % 10000 == 0: print(cost)
    grad = np.dot((np.dot(X_train_rev, coefficient.T) - Y_train).T, X_train_rev)  # compute gradient
    coefficient = coefficient - (alpha * grad)  # adjust coefficients, including the intercept



Cost after Iterations:
45.23042864973579
10.428401916963285
10.428401916963285
10.428401916963285
10.428401916963285
10.428401916963285
10.428401916963285
10.428401916963285
10.428401916963285
10.428401916963285

The cost from my manual code stops decreasing and is much further from the cost reported by sklearn. I tried playing with the alpha variable, but any increase in alpha makes the cost diverge towards positive infinity.
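As a sanity check (an added sketch, not part of my original code), the least-squares optimum for the same design matrix gives the lowest cost that gradient descent could ever reach, so comparing against it would show whether 10.43 is a plateau or the true floor:

# Closed-form least-squares solution on the same X_train_rev / Y_train.
best_coef, _, _, _ = np.linalg.lstsq(X_train_rev, Y_train, rcond=None)
best_cost = np.sqrt(np.sum((X_train_rev @ best_coef - Y_train)**2))
print(best_cost)  # minimum achievable value of the cost defined above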

Please note that the X_train_rev data used in my manual code has 10 columns/features instead of the 9 features in the sklearn training set, because I added a column of ones to the training set to represent the intercept. Similarly, the coefficient vector also includes the intercept.
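For reference, a minimal sketch of how such a matrix could be built (assuming the ones column is prepended; the actual construction is not shown in the question):

# Hypothetical construction of X_train_rev: stack a column of ones onto X_train
# so that one coefficient plays the role of the intercept.
ones = np.ones((X_train.shape[0], 1))
X_train_rev = np.hstack([ones, X_train])                  # shape (478, 10)
coefficient = np.random.randn(1, X_train_rev.shape[1])    # 10 values, including the intercept slot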

Upvotes: 3

Views: 1527

Answers (1)

Anvar Kurmukov

Reputation: 662

I have tried to replicate your problem:

import numpy as np

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

X, Y = make_regression(n_samples=500, n_features=9, bias=0, random_state=1)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.30, random_state=1)
classifier = LinearRegression(fit_intercept=False)
classifier.fit(X_train,Y_train)
cost = np.sqrt(np.sum((np.dot(X_train,classifier.coef_.reshape(9,1)) + classifier.intercept_ - Y_train.reshape(-1,1))**2))

print(cost)
print('Manual regression')
Y_train = Y_train.reshape(-1,1)

alpha = 0.0005 # learning rate
coefficient = np.random.randn(1,9) # initialise the 9 coefficients (no intercept, since fit_intercept=False)

# Loop through iterations
for i in range(100000):
    cost = np.sqrt(np.sum((np.dot(X_train,coefficient.T) - Y_train)**2)) # cost result
    if i % 10000 == 0: print(cost)
    grad = np.dot((np.dot(X_train,coefficient.T) - Y_train).T, X_train) # compute the gradient
    coefficient = coefficient - (alpha * grad) # update the coefficients

I made small adjustments to make the code fully reproducible, and I faced no problems. There is a small difference in MSE between the two solutions, but both scores are less than 1e-11, so it is a numerical issue.
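A sketch of the kind of comparison behind that statement (the exact check shown in the screenshot is not reproduced here):

# Compare the training MSE of the sklearn solution and the gradient-descent solution.
# With fit_intercept=False both models use only the 9 learned coefficients, so the
# two errors should agree up to numerical precision (differences below 1e-11).
sk_mse = mean_squared_error(Y_train, X_train @ classifier.coef_.reshape(-1, 1))
gd_mse = mean_squared_error(Y_train, X_train @ coefficient.T)
print(sk_mse, gd_mse, abs(sk_mse - gd_mse))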


Upvotes: 2
