Tim Lindsey

Reputation: 757

statsmodels add_constant for OLS intercept, what is this actually doing?

Reviewing linear regressions fitted with statsmodels OLS, I see that you have to use add_constant to add a constant '1' to all the points in your independent variable(s) before fitting. However, my only understanding of an intercept in this context is the value of y on the fitted line when x equals 0, so I'm not clear what purpose always injecting a '1' serves. What is this constant actually telling the OLS fit?

Upvotes: 14

Views: 20256

Answers (2)

wi3o

Reputation: 1617

statsmodels' sm.add_constant serves the same purpose as fit_intercept=True in scikit-learn's LinearRegression(). The defaults differ: scikit-learn fits an intercept unless you turn it off, while statsmodels' OLS only fits one if you add the constant yourself.

If you skip sm.add_constant, or if you use LinearRegression(fit_intercept=False), both libraries assume that b = 0 in y = mx + b. They fit the model with b fixed at 0 instead of estimating b from your data.

Upvotes: 11

BrenBarn

Reputation: 251568

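It doesn't add a constant to your values; it adds a constant term to the linear equation it is fitting. In the single-predictor case, it's the difference between fitting a line y = mx to your data and fitting y = mx + b. The column of ones is just another predictor whose value is always 1, so its fitted coefficient b contributes b * 1 = b at every point, which is exactly the intercept.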
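To see where the '1' goes, here is a rough sketch that builds the same kind of design matrix by hand and solves it with plain NumPy least squares instead of statsmodels (the data are made up for illustration, and np.linalg.lstsq stands in for the OLS fit):

```python
import numpy as np

# Synthetic data (made up for illustration): y = 3x + 1 exactly
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 3.0 * x + 1.0

# Design matrix with a leading column of ones, as add_constant would produce:
# each row is [1, x_i], so the fitted model is y_i = b*1 + m*x_i
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares solution for the coefficients [b, m]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b, m = coef

print(X)     # first column is all ones, second column is x
print(b, m)  # intercept and slope
```

The ones never change the x values themselves. They only give the solver a second coefficient to estimate, and that coefficient is the value of y at x = 0.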

Upvotes: 16
