Reputation: 227
I wrote a linear regression model with a single variable, but it raises a ValueError when I run the following code:
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression as lr
import numpy as np
x=np.array([0,1,2,3,4,5,6,7,8,9])
y=np.array([1,3,2,5,7,8,8,9,10,12])
reg=lr().fit(x.reshape(10,1),y.reshape(10,1))
y_l = reg.intercept_ + reg.coef_ *x
plt.plot(x,y_l)
plt.show()
When I reshape the numpy array x with x.reshape(10,1) in the linear equation, the ValueError goes away, but I don't understand why:
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression as lr
import numpy as np
x=np.array([0,1,2,3,4,5,6,7,8,9])
y=np.array([1,3,2,5,7,8,8,9,10,12])
reg=lr().fit(x.reshape(10,1),y.reshape(10,1))
y_l = reg.intercept_ + reg.coef_ *x.reshape(10,1)
plt.plot(x,y_l)
plt.show()
Can anyone help me with this? Thanks in advance.
Upvotes: 1
Views: 246
Reputation: 18367
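This happens because you are multiplying the 1D np.array x, of shape (10,), by the 2D array reg.coef_, which here has shape (1, 1). The multiplication itself succeeds, but broadcasting gives a 2D result of shape (1, 10), and plt.plot rejects it because its first dimension (1) does not match the 10 elements of x. To fix this, reshape either x or reg.coef_ so that the first dimension of the result has the same length as x.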
This should also work:
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression as lr
import numpy as np
x=np.array([0,1,2,3,4,5,6,7,8,9])
y=np.array([1,3,2,5,7,8,8,9,10,12])
reg=lr().fit(x.reshape(10,1),y.reshape(10,1))
y_l = reg.intercept_ + reg.coef_.reshape(1)*x
plt.plot(x,y_l)
plt.show()
print(reg.coef_.shape)
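To see the shapes involved without fitting a model, here is a minimal sketch using NumPy only. The names coef and intercept, and their values, are made-up stand-ins for reg.coef_ and reg.intercept_; only their shapes match the question.

```python
import numpy as np

x = np.arange(10)              # shape (10,), same as x in the question
coef = np.array([[1.2]])       # hypothetical stand-in for reg.coef_, shape (1, 1)
intercept = np.array([1.0])    # hypothetical stand-in for reg.intercept_, shape (1,)

# Original expression: (1, 1) * (10,) broadcasts to (1, 10)
row = intercept + coef * x

# Reshaping x: (1, 1) * (10, 1) broadcasts to (10, 1)
col = intercept + coef * x.reshape(10, 1)

# Reshaping coef: (1,) * (10,) broadcasts to (10,)
flat = intercept + coef.reshape(1) * x

print(row.shape, col.shape, flat.shape)
```

Only the first result has a first dimension that differs from len(x), which is why only the original expression fails in plt.plot(x, y_l).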
Upvotes: 0
Reputation: 7509
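reg.coef_ is a 2D array, with shape (1, 1) in this case. It is 2D here because y was passed to fit as a 2D array of shape (10, 1); scikit-learn then stores the coefficients with shape (n_targets, n_features) to account for multiple targets and multiple features. With a 1D y, reg.coef_ would be 1D with shape (n_features,).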
Broadcasting rules make the expression reg.coef_ * x return a 2D array of shape (1, 10), which does not match the shape (10,) of x when plotting, resulting in the error you see.
In your case, I'd say the cleanest expression to fix this is:
y_l = reg.intercept_ + reg.coef_.reshape(1) * x
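As an alternative, here is a sketch that assumes you don't actually need a 2D target: fitting with y left as a 1D array should give a 1D reg.coef_, so the original expression works without any reshape. The variable names X, reg_2d, reg_1d and y_pred are only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
X = x.reshape(-1, 1)   # features must be 2D: (n_samples, n_features)

# 2D target, as in the question: coef_ has shape (n_targets, n_features)
reg_2d = LinearRegression().fit(X, y.reshape(-1, 1))

# 1D target: coef_ has shape (n_features,)
reg_1d = LinearRegression().fit(X, y)

print(reg_2d.coef_.shape, reg_1d.coef_.shape)

# With the 1D fit, the line equation yields a 1D array of length 10
y_l = reg_1d.intercept_ + reg_1d.coef_ * x

# predict() computes the same fitted values from the 2D feature matrix
y_pred = reg_1d.predict(X)
```

Either y_l or y_pred can then be passed to plt.plot(x, ...) directly, since both have the same length as x.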
Upvotes: 1