ggupta

Reputation: 707

Use existing coefficient and intercept in Linear regression

I'm using scikit-learn's LinearRegression. My model runs every day, and I store the model's fitted attributes (coef_ and intercept_) in a file so that they can be reused the next time the model runs.

Suppose I run the model daily over a history of one year. On 25 November I save the model's coef_ and intercept_ to a file. I then restart my program, which starts again from 25 November and runs to the end of the history.

When I compare the predictions for 26 November before and after the restart, they are different. So I thought of reusing the coef_ and intercept_ saved before the restart, so that after the restart the model predicts the same values for 26 November.

To do this, I simply overwrite coef_ and intercept_:

from sklearn import linear_model

model = linear_model.LinearRegression()
model.coef_ = coef_stored
model.intercept_ = intercept_stored

model.fit(X, y)
model.predict(x)

I want the predictions for the 26th to be the same before and after the restart. With the code above I was not able to achieve that.

Upvotes: 1

Views: 2321

Answers (1)

seralouk

Reputation: 33147

Altering the attributes of an untrained model is not recommended, but, following desertnaut's comment, you can do it as shown in "How to instantiate a Scikit-Learn linear model with known coefficients without fitting it".

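However, if you call the fit method afterwards, the coefficients and intercept will be overwritten. That is what happens in your code: model.fit(X, y) runs after the assignment, so the stored values are discarded. Assign the attributes and call predict without fitting: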

from sklearn.linear_model import LinearRegression
import numpy as np
np.random.seed(0)

my_intercepts = np.ones(2)
my_coefficients = np.random.randn(2, 3)

new_model = LinearRegression()
new_model.intercept_ = my_intercepts
new_model.coef_ = my_coefficients

print(new_model.coef_)
#[[ 1.76405235  0.40015721  0.97873798]
# [ 2.2408932   1.86755799 -0.97727788]]


new_model.predict(np.random.randn(5, 3))
#array([[ 2.51441481,  2.94725181],
#       [ 3.20531004,  0.76788778],
#       [ 2.82562532,  2.49886169],
#       [ 1.98568931,  4.73850448],
#       [-1.28821286,  2.60145844]])
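
To check the point about fit overwriting manually assigned values, here is a minimal sketch. The data, the "stored" values and all variable names are made up for illustration; only LinearRegression, fit, predict, coef_ and intercept_ come from scikit-learn.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical noiseless data generated from known parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 3.0

# Hypothetical values standing in for what was saved before the restart.
stored_coef = np.array([10.0, 10.0, 10.0])
stored_intercept = -5.0

model = LinearRegression()
model.coef_ = stored_coef
model.intercept_ = stored_intercept

# Without fit, predict uses the assigned values.
pred_with_stored = model.predict(X[:1])

# fit re-estimates the parameters from (X, y) and replaces the assigned ones.
model.fit(X, y)
coef_after_fit = model.coef_
intercept_after_fit = model.intercept_

print(stored_coef, coef_after_fit)
```

After fit, the coefficients reflect the training data rather than the stored values, so any prediction made afterwards no longer depends on what was loaded from the file.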

You said:

So i just thought of using the coef_ & intercept_ before restarting, so that after restart, it should predict the same for 26th November.

If you do not get the same results even though you are sure that you are using the same data and the same model coefficients, then something is wrong. Small differences can be expected across sklearn versions if you upgraded sklearn between the "before" and "after" runs of your program.
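
One way to test this is a save/restore round trip, sketched below under some assumptions: the data is synthetic, the file name linreg_params.npz and all variable names are hypothetical, and np.savez is just one possible storage format, not necessarily what your program uses.

```python
import os
import tempfile

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-ins for the training history and the next day's features.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))
y_train = X_train @ np.array([2.0, -1.0, 0.5]) + 1.0 + rng.normal(scale=0.1, size=100)
X_next_day = rng.normal(size=(5, 3))

# "Before restart": fit, predict, and save the fitted parameters.
model = LinearRegression().fit(X_train, y_train)
pred_before_restart = model.predict(X_next_day)

path = os.path.join(tempfile.mkdtemp(), "linreg_params.npz")  # hypothetical file
np.savez(path, coef=model.coef_, intercept=model.intercept_)

# "After restart": load the parameters into a fresh model and do NOT call fit.
params = np.load(path)
restored = LinearRegression()
restored.coef_ = params["coef"]
restored.intercept_ = params["intercept"]
pred_after_restart = restored.predict(X_next_day)

print(np.allclose(pred_before_restart, pred_after_restart))
```

If the two prediction arrays agree here but not in your program, the difference is more likely to come from the input data or from a fit call after loading than from the stored parameters themselves.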

Upvotes: 1
