Reputation: 15660
I'm trying to fit a least-squares line to my data using SciPy's linregress(),
with something like this:
from scipy import stats
import numpy as np
y = [30, 60, 19, 28, 41, 49, 62, 75, 81]
x = np.arange(0,9)
grad, intercept, r_value, p_value, std_err = stats.linregress(x,y)
However, I would also like to fix the y-intercept at a particular value.
Ideally, I would fix it at the first value in the y list. In other words, I want the best-fit line to pass through the first value in y, which is 30 in my example.
But SciPy seems to choose the y-intercept for me.
How can I fix the y-intercept to a particular value in scipy's linear regression method?
PS: I've also tried statsmodels' OLS, but it only lets me either force the y-intercept to 0 or let it decide the best intercept for me.
Upvotes: 1
Views: 2712
Reputation: 4657
In a polynomial equation such as a parabola:
Y = a + bX + cX^2
when X = 0, Y = a. So if you are fitting a polynomial and can hold a at a fixed value, you can make the Y-intercept equal to any value you wish. Using the parabola again, if you fit data to the equation:
Y = 7.5 + bX + cX^2
then the intercept of the fitted curve will be 7.5, because only b and c are estimated.
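As a rough illustration of the idea, here is a minimal NumPy sketch that holds a fixed and estimates only b and c. The helper name fit_parabola_fixed_intercept and the demo data are made up for this example (the data is generated from known coefficients so the result is easy to check); it is one possible way to do this, not a SciPy feature.

```python
import numpy as np

def fit_parabola_fixed_intercept(x, y, a):
    """Least-squares fit of y = a + b*x + c*x**2 with a held fixed.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Move the known constant to the left-hand side, so the design
    # matrix has columns for x and x**2 only (no column of ones).
    design = np.column_stack([x, x**2])
    (b, c), *_ = np.linalg.lstsq(design, y - a, rcond=None)
    return b, c

# Synthetic, noise-free demo data built from a = 7.5, b = 2.0, c = 0.5
x_demo = np.arange(6)
y_demo = 7.5 + 2.0 * x_demo + 0.5 * x_demo**2

b, c = fit_parabola_fixed_intercept(x_demo, y_demo, a=7.5)
print(b, c)  # approximately 2.0 and 0.5
```

Because the constant is subtracted before fitting, the fitted curve a + b*X + c*X^2 equals a at X = 0 regardless of the data.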
Upvotes: 0
Reputation: 22897
In statsmodels you can shift y so that the desired intercept becomes zero, and then exclude the intercept:
res = OLS(y - 30., x).fit()
where x contains the regressors without an intercept (no column of ones). The interpretation is then that we predict the deviation from 30:
y_predicted = 30 + res.predict(...)
Almost all statistics, such as bse, tvalues, and pvalues, and fit statistics such as rsquared, are independent of the shift in location, assuming the constant is fixed at the shift value.
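The same shift-and-drop-the-intercept idea can be sketched without statsmodels, using only NumPy and the data from the question. This is an illustrative equivalent, not the statsmodels call above: it assumes a single regressor and uses np.linalg.lstsq on a one-column design matrix.

```python
import numpy as np

y = np.array([30, 60, 19, 28, 41, 49, 62, 75, 81], dtype=float)
x = np.arange(0, 9, dtype=float)
intercept = y[0]  # fix the intercept at the first y value, 30

# One-column design matrix (no column of ones), fitted to the shifted y.
slope = np.linalg.lstsq(x[:, None], y - intercept, rcond=None)[0][0]

# Add the fixed intercept back when predicting.
y_fit = intercept + slope * x
print(slope)  # 1056 / 204, about 5.176
```

For a single regressor this reduces to the closed form slope = sum(x * (y - 30)) / sum(x**2), and the fitted line passes through (0, 30) by construction.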
Upvotes: 2