user3062260

Reputation: 1644

statsmodels not returning correct gradient

I feel like I've missed something really simple that someone will be able to correct fairly easily. I can't get Python's statsmodels to return the correct gradient. Here is my code:

import statsmodels.api as sm 

y_values = [10, 8, 6, 4, 2]
x_values = [1, 2, 3, 4, 5]

mod = sm.OLS(y_values, x_values)
fit = mod.fit()

print(fit.summary())

print(fit.params[0])

An answer I found suggests that fit.params[0] is the right parameter to read the gradient from, but it gives me 1.2727, which I don't understand. I was expecting a gradient of -2.0 for this example.

Upvotes: 1

Views: 290

Answers (2)

user3062260

Reputation: 1644

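Got there in the end with sklearn, which is a bit more 'straight out of the box':

import numpy as np
from sklearn.linear_model import LinearRegression

# scikit-learn expects X as a 2D array of shape (n_samples, n_features)
X = np.array(x_values).reshape(-1, 1)
mod = LinearRegression().fit(X, y_values)
grad = mod.coef_[0]  # y is 1D, so coef_ has shape (n_features,)

print(grad)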

Upvotes: 1

StupidWolf

Reputation: 46888

In your example, the line does not pass through the origin, so the model cannot be y = mx; it has to be y = mx + c. sm.OLS does not add an intercept by default, so you need to add one yourself:

mod = sm.OLS(y_values, sm.add_constant(x_values))
fit = mod.fit()
print(fit.params)

[12. -2.]
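To see where the 1.2727 in the question comes from, here is a minimal sketch in plain Python that applies the textbook least-squares formulas by hand, first without an intercept and then with one. The variable names (slope_through_origin, slope, intercept) are only illustrative and are not part of statsmodels.

```python
# Data from the question.
x_values = [1, 2, 3, 4, 5]
y_values = [10, 8, 6, 4, 2]

# Model without an intercept (y = m*x): m = sum(x*y) / sum(x*x).
slope_through_origin = (
    sum(x * y for x, y in zip(x_values, y_values))
    / sum(x * x for x in x_values)
)

# Model with an intercept (y = m*x + c):
# m = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)**2), c = y_mean - m * x_mean.
n = len(x_values)
x_mean = sum(x_values) / n
y_mean = sum(y_values) / n
slope = (
    sum((x - x_mean) * (y - y_mean) for x, y in zip(x_values, y_values))
    / sum((x - x_mean) ** 2 for x in x_values)
)
intercept = y_mean - slope * x_mean

print(round(slope_through_origin, 4))  # 1.2727
print(slope, intercept)  # -2.0 12.0
```

The first value matches what sm.OLS returns without a constant column, and the second pair matches the [12. -2.] output above, where the intercept comes first and the gradient second.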

If it helps, a scatter plot shows that the points do not line up with the origin:

import seaborn as sns
sns.scatterplot(x=x_values,y=y_values)


Upvotes: 2
