Reputation: 680
I'm putting together a simple OLS example with sklearn, and I'm getting strange results. The code is below:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression
model = Pipeline([('linear', LinearRegression(fit_intercept=True))])
n = 100
x = np.linspace(0, 10, n)
eps = np.random.randn(n)
y = 0.5 * x + -2.5 + eps
model.fit(y.reshape(-1, 1), x.reshape(-1, 1))
yhat = model.predict(x.reshape(-1, 1))
plt.scatter(x, y)
plt.plot(x, yhat, 'r')
The fitted line is far off from the data, which seems strange for OLS. I'd like someone else to reproduce this before I post it on the main sklearn issue tracker. My versions are below:
sklearn=0.22.1
python=3.6.1
Upvotes: 1
Views: 294
Reputation: 856
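The arguments to fit are swapped. fit expects the features X first and the target y second, so this line regresses x on y instead of y on x: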
model.fit(y.reshape(-1, 1), x.reshape(-1, 1))
Change it to:
model.fit(x.reshape(-1, 1), y.reshape(-1, 1))
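To check the fix numerically rather than by eye, here is a minimal sketch that fits both argument orders on the same kind of data and compares the recovered coefficients. The fixed seed and the names `rng`, `correct`, and `swapped` are my own additions for illustration, and the Pipeline wrapper is dropped for brevity.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Fixed seed is an assumption added here so the run is repeatable.
rng = np.random.default_rng(0)

n = 100
x = np.linspace(0, 10, n)
y = 0.5 * x - 2.5 + rng.standard_normal(n)

# Correct order: features first, target second.
correct = LinearRegression(fit_intercept=True).fit(x.reshape(-1, 1), y)

# Swapped order from the question: this models x as a function of y.
swapped = LinearRegression(fit_intercept=True).fit(y.reshape(-1, 1), x)

slope = correct.coef_[0]
intercept = correct.intercept_
swapped_slope = swapped.coef_[0]

print("correct fit: slope=%.3f intercept=%.3f" % (slope, intercept))
print("swapped fit: slope=%.3f" % swapped_slope)
```

With the correct order, the slope and intercept should land near the true values 0.5 and -2.5. The swapped model estimates the inverse relationship, so calling predict on x with it produces a line that does not match the scatter plot.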
Upvotes: 2