Reputation: 1324
I am trying to visualize a multiple logistic regression, but I get the error `ValueError: X has 2 features per sample; expecting 11`.
I'm practicing on the red wine quality dataset from Kaggle.
Here is the full traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-88-230199fd3a97> in <module>
4 X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
5 np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
----> 6 plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
7 alpha = 0.75, cmap = ListedColormap(('red', 'green')))
8 plt.xlim(X1.min(), X1.max())
/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/base.py in predict(self, X)
287 Predicted class label per sample.
288 """
--> 289 scores = self.decision_function(X)
290 if len(scores.shape) == 1:
291 indices = (scores > 0).astype(np.int)
/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/base.py in decision_function(self, X)
268 if X.shape[1] != n_features:
269 raise ValueError("X has %d features per sample; expecting %d"
--> 270 % (X.shape[1], n_features))
271
272 scores = safe_sparse_dot(X, self.coef_.T,
ValueError: X has 2 features per sample; expecting 11
Below is the visualization code:
# Visualising the Training set results
from matplotlib.colors import ListedColormap
X_set, y_set = X_train, y_train
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Logistic Regression (Training set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
Upvotes: 1
Views: 19078
Reputation: 33147
You would need to post the full code to be sure, but it looks like the model was trained on 11 features and you are now trying to predict with only 2.
classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape)
Here, the array passed to predict, np.array([X1.ravel(), X2.ravel()]).T, must have exactly the same number of columns (axis = 1) as the array originally used to train (.fit) the classifier. The grid has only 2 columns, while the classifier was fit on 11.
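As a rough sketch of two possible ways around the mismatch, the snippet below uses synthetic data in place of the wine dataset (the random `X_train`/`y_train`, the column choice `cols`, the coarser grid step of 0.1, and the names `clf_2d`, `clf_full`, `grid_2d`, `grid_full` are all assumptions for illustration, not taken from the question). Option 1 fits a separate classifier on only the two plotted columns. Option 2 keeps the 11-feature classifier and fills the other nine columns of the grid with their training means.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the scaled training data: 200 samples, 11 features.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 11))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

cols = [0, 1]  # hypothetical choice of the two features to plot

# Grid over the two plotted features (step 0.1 keeps the grid small).
X1, X2 = np.meshgrid(
    np.arange(X_train[:, cols[0]].min() - 1, X_train[:, cols[0]].max() + 1, 0.1),
    np.arange(X_train[:, cols[1]].min() - 1, X_train[:, cols[1]].max() + 1, 0.1),
)
grid_2d = np.array([X1.ravel(), X2.ravel()]).T  # shape (n_points, 2)

# Option 1: fit a classifier on just the two plotted columns,
# so the grid and the training data have the same number of columns.
clf_2d = LogisticRegression().fit(X_train[:, cols], y_train)
Z_2d = clf_2d.predict(grid_2d).reshape(X1.shape)

# Option 2: keep the 11-feature classifier and pad the grid to 11 columns,
# holding the nine unplotted features at their training means.
clf_full = LogisticRegression().fit(X_train, y_train)
grid_full = np.tile(X_train.mean(axis=0), (grid_2d.shape[0], 1))  # (n_points, 11)
grid_full[:, cols] = grid_2d
Z_full = clf_full.predict(grid_full).reshape(X1.shape)

print(grid_2d.shape[1], grid_full.shape[1])
```

Either `Z_2d` or `Z_full` can then be passed to `plt.contourf(X1, X2, Z, ...)` in place of the failing `classifier.predict(...)` call. Option 1 shows the boundary of a different, two-feature model. Option 2 shows a slice through the original model's boundary at the mean of the other features.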
Upvotes: 2