Amzzz

Reputation: 55

ValueError: Found input variables with inconsistent numbers of samples: [676, 540]

X_train, X_test, y_train, y_test = train_test_split(features, df['Label'], test_size=0.2, random_state=111)
print(X_train.shape)  # (540, 4196)
print(X_test.shape)   # (136, 4196)
print(y_train.shape)  # (540,)
print(y_test.shape)   # (136,)

Fitting the classifier raises an error:

from sklearn.svm import SVC
classifier = SVC(random_state = 0)
classifier.fit(features,y_train)
y_pred = classifier.predict(features)

Error:

ValueError: Found input variables with inconsistent numbers of samples: [676, 540]

I tried this.

Upvotes: 0

Views: 11341

Answers (2)

Jan Jaap Meijerink

Reputation: 427

You want to call the fit function with your X_train, not with features. The error occurs because features (676 rows) and y_train (540 rows) don't have the same number of samples.

X_train, X_test, y_train, y_test = train_test_split(features, df['Label'], test_size=0.2, random_state=111)
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

from sklearn.svm import SVC
classifier = SVC(random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

You'll likely also want to call predict with X_test (or X_train) rather than features. You may want to learn a bit more about train/test splits and why they are used.
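To see the difference end to end, here is a minimal sketch on synthetic data. The random `features` array and the `labels` vector are stand-ins I made up for the asker's data (20 columns instead of 4196, to keep it fast); only the row count of 676 and the split parameters come from the question.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Assumed stand-ins for `features` and df['Label']; not the asker's real data.
rng = np.random.default_rng(0)
features = rng.normal(size=(676, 20))
labels = (features[:, 0] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=111
)

classifier = SVC(random_state=0)

# Mismatched call: all 676 rows of features against the 540 training labels.
try:
    classifier.fit(features, y_train)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Matched call: both arguments come from the same side of the split.
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print(y_pred.shape, y_test.shape)
```

Because `y_pred` and `y_test` then have the same length, they can be compared directly, for example with `classifier.score(X_test, y_test)`.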

Upvotes: 2

Francisco Parrilla

Reputation: 513

Why are you passing features along with y_train to .fit()? I think you are supposed to use X_train instead.

Instead of

classifier.fit(features, y_train)

Use:

classifier.fit(X_train, y_train)

You are passing two sets of data with different numbers of samples. Because you did the split earlier, features still has all 676 rows, while y_train has only the 540 training rows.
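The two numbers in the error message can be reproduced with a little arithmetic. This is only a sketch: `split_sizes` is a hypothetical helper, and it assumes the test share is rounded up with the remainder going to training, which agrees with the shapes printed in the question.

```python
import math

def split_sizes(n_samples, test_size):
    """Hypothetical helper: row counts for a fractional test_size.

    Assumes the test share is rounded up and training gets the rest.
    """
    n_test = math.ceil(test_size * n_samples)
    n_train = n_samples - n_test
    return n_train, n_test

n_train, n_test = split_sizes(676, 0.2)
print(n_train, n_test)  # 540 136
```

So the 676 in the error is the full features array, and the 540 is the training portion that y_train belongs to.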

Also, your predict line should be:

classifier.predict(X_test)

Upvotes: 1
