orion24

Reputation: 69

Getting error: Shapes not aligned, with statsmodels and simple 2 dimensional linear regression

import numpy as np
import statsmodels.api as sm


list21 = [-0.77, -0.625, -0.264, 0.888, 1.8, 2.411, 2.263, 2.23, 1.981, 2.708]
list23 = [-1.203, -1.264, -1.003, -0.388, -0.154, -0.129, -0.282, -0.017, -0.06, 0.275]

X1 = np.asarray(list21)
Y1 = np.asarray(list23)
    
x = X1.reshape(-1, 1)
y = Y1.reshape(-1, 1)

   
model = sm.OLS(x, y)
fit = model.fit()

y_pred = model.predict(x)

Error reads as:

--> 161     y_pred = model.predict(x)

ValueError: shapes (10,1) and (10,1) not aligned: 1 (dim 1) != 499 (dim 0)

I've been banging my head against the wall for the past half hour. Please help.

Upvotes: 2

Views: 5036

Answers (1)

Stefan

Reputation: 957

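You are calling predict on the unfitted model instead of on the fitted results. sm.OLS(...) returns a model object whose predict method expects the parameters as its first argument, so model.predict(x) treats x as the coefficients and the matrix product fails. Use:

model = sm.OLS(y, x)
fit = model.fit()
y_pred = fit.predict(x)

Or use

model = sm.OLS(y, x).fit()
y_pred = model.predict(x)

In either case, call predict on the object returned by fit(). Also note that OLS takes the response first and the regressors second, so to predict y from x the call is sm.OLS(y, x), not sm.OLS(x, y).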

EDIT

To answer your follow-up question about why the line goes through zero: the model has no intercept. sm.OLS does not add one by default; you have to add it yourself with sm.add_constant. Please refer to this documentation: https://www.statsmodels.org/dev/examples/notebooks/generated/ols.html

Applied to your code you get:

import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

list21 = [-0.77, -0.625, -0.264, 0.888, 1.8, 2.411, 2.263, 2.23, 1.981, 2.708]
list23 = [-1.203, -1.264, -1.003, -0.388, -0.154, -0.129, -0.282, -0.017, -0.06, 0.275]

x = np.asarray(list21)
y = np.asarray(list23)
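X = sm.add_constant(x)       # adds a column of ones so an intercept is estimated
model = sm.OLS(y, X)         # response first, then the regressors
results = model.fit()
y_pred = results.predict(X)  # predict from the fitted results
plt.scatter(x, y)
plt.plot(x, y_pred)
plt.show()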
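As a cross-check on why the constant matters, here is a rough NumPy-only sketch of the same least-squares fit with and without a column of ones. It does not use statsmodels; the np.column_stack call is my stand-in for what sm.add_constant does for this 1-D input, and the variable names are made up for illustration.

```python
import numpy as np

x = np.array([-0.77, -0.625, -0.264, 0.888, 1.8, 2.411, 2.263, 2.23, 1.981, 2.708])
y = np.array([-1.203, -1.264, -1.003, -0.388, -0.154, -0.129, -0.282, -0.017, -0.06, 0.275])

# Without a constant the design matrix is just the x column,
# so the fitted line is y_hat = b * x and must pass through the origin.
A_no_const = x.reshape(-1, 1)
coef_no_const = np.linalg.lstsq(A_no_const, y, rcond=None)[0]
pred_at_zero_no_const = coef_no_const[0] * 0.0

# With a column of ones the fit is y_hat = a + b * x,
# so the intercept a is estimated from the data.
A_const = np.column_stack([np.ones_like(x), x])
intercept, slope = np.linalg.lstsq(A_const, y, rcond=None)[0]

print("no constant, prediction at x=0:", pred_at_zero_no_const)
print("with constant, intercept and slope:", intercept, slope)
```

The intercept and slope from the second fit should agree with results.params from the statsmodels code above, up to floating-point error.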

Upvotes: 3
