Carven

Reputation: 15660

How to fix y-intercept value in linear regression?

I'm trying to fit a least-squares line to my data using SciPy's linregress(), with something like this:

from scipy import stats
import numpy as np

y = [30, 60, 19, 28, 41, 49, 62, 75, 81]
x = np.arange(0,9)

grad, intercept, r_value, p_value, std_err = stats.linregress(x,y)

However, I would also like to fix the y-intercept at a particular point.

Ideally, I'd like to fix it at the first value in the y list. In other words, I want the best-fit line to pass through the first value in y, which is 30 in my example.

But it seems that SciPy always chooses the y-intercept for me.

How can I fix the y-intercept to a particular value in scipy's linear regression method?

PS: I've also tried statsmodels' OLS, but it only lets me either force the y-intercept to 0 or let it choose the best intercept for me.

Upvotes: 1

Views: 2712

Answers (2)

James Phillips

Reputation: 4657

In a polynomial equation such as a parabola:

Y = a + bX + cX^2

When X = 0, Y = a. So if you are fitting a polynomial and hold a at a fixed value instead of estimating it, you can make the Y-intercept equal to any value you wish. Again using the example of a parabola, if you fit data to the equation:

Y = 7.5 + bX + cX^2

then the intercept of the fitted curve will be 7.5. A straight line is the special case c = 0, so the same approach works for linear regression.
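One way to do this with NumPy is sketched below. It is only an illustration, not the only approach: it moves the fixed constant to the left-hand side and solves for b and c with np.linalg.lstsq. The data come from the question, the intercept 7.5 is the example value from the equation above, and the names fixed_a, A, and predict are made up for this sketch.

```python
import numpy as np

# Data from the question; 7.5 is the illustrative fixed intercept used above.
x = np.arange(0, 9, dtype=float)
y = np.array([30, 60, 19, 28, 41, 49, 62, 75, 81], dtype=float)
fixed_a = 7.5

# Move the known constant to the left-hand side: Y - a = b*X + c*X^2.
# The design matrix therefore has no column of ones.
A = np.column_stack([x, x**2])
coeffs, *_ = np.linalg.lstsq(A, y - fixed_a, rcond=None)
b, c = coeffs

def predict(x_new):
    """Evaluate the fitted parabola with the intercept held at fixed_a."""
    x_new = np.asarray(x_new, dtype=float)
    return fixed_a + b * x_new + c * x_new**2

print("b =", b, "c =", c)
print("value at X = 0:", predict(0.0))
```

Dropping the x**2 column from A gives the straight-line version of the same fit.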

Upvotes: 0

Josef

Reputation: 22897

In statsmodels you can shift y so that the desired intercept sits at zero, and then fit without an intercept:

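from statsmodels.api import OLS

# y is a plain list in the question, so convert it before subtracting
res = OLS(np.asarray(y) - 30., x).fit()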

where x contains the regressors without an intercept column (a column of ones). The fitted model then predicts the deviation from 30, so add 30 back to get predictions on the original scale:

y_predicted = 30 + res.predict(...)

Almost all statistics, such as bse, tvalues, and pvalues, as well as fit statistics such as rsquared, are independent of the shift in location, assuming the constant is fixed at the shift value.
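To see what the shifted regression computes, here is a minimal NumPy-only sketch for the data in the question. With a single regressor and no intercept, least squares reduces to the closed form sum(x * y_shifted) / sum(x * x), so no fitting library is needed for the check. The variable names intercept, y_shifted, and slope are chosen just for this example.

```python
import numpy as np

x = np.arange(0, 9, dtype=float)
y = np.array([30, 60, 19, 28, 41, 49, 62, 75, 81], dtype=float)
intercept = 30.0  # the value the line is forced through at x = 0

# Least squares through the origin on the shifted response:
# minimizing sum((y - intercept - slope*x)**2) gives this closed form.
y_shifted = y - intercept
slope = np.sum(x * y_shifted) / np.sum(x * x)

# Add the shift back to return to the original scale.
y_predicted = intercept + slope * x
print("slope:", slope)
print("first fitted value:", y_predicted[0])
```

Because x[0] is 0, the first fitted value is exactly the fixed intercept, which is what the question asks for.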

Upvotes: 2
