czy

Reputation: 513

How to resolve ValueError: The number of classes has to be greater than one; got 1 class

When I ran the following:

from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
clf = SVC(kernel='rbf', probability=True)
clf.fit(x_train, y_train)

I received the ValueError: The number of classes has to be greater than one; got 1 class

I did: print(np.unique(y_train)), which returned [0].

Can anyone point me in the right direction for a solution?

Upvotes: 3

Views: 3668

Answers (2)

Alexander L. Hayes

Reputation: 4273

Using stratify with train_test_split makes this less likely:

train_test_split(X, y, stratify=y)

Explanation: train_test_split is random. It is possible to produce a split where y_train does not contain both positive and negative examples, which means we cannot train a discriminative classifier:

from sklearn.model_selection import train_test_split
import numpy as np

X = np.ones((8, 2))
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])

_, _, y_train, y_test = train_test_split(X, y, random_state=33)
# y_train:   [0 0 0 0 0 0]    # <--- Uh oh, there are no 1s in the training set!
# y_test:    [1 1]

Stratification first separates data based on the label. This means our training data should have at least one of each label:

_, _, y_train, y_test = train_test_split(X, y, stratify=y)
# y_train:   [0 0 0 0 1 0]
# y_test:    [0 1]

Upvotes: 0

ultrapoci

Reputation: 335

Either your y array contains no 1s at all, or the 1s in y are few enough that y_train can end up containing none of them. Print y first; if it does contain 1s, change your splitting strategy (for example, use stratification) so that every class appears at least once in both y_train and y_test.
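A quick way to run both checks is to count the labels before splitting and then verify the split. This is a minimal sketch: `y` below is a stand-in for your own label array, and the imbalance is made up for illustration:

```python
import numpy as np
from collections import Counter
from sklearn.model_selection import train_test_split

# Stand-in data: 20 samples with a heavy class imbalance (18 zeros, 2 ones)
X = np.ones((20, 2))
y = np.array([0] * 18 + [1] * 2)

# Step 1: check the overall class distribution
print(Counter(y))  # shows how many samples each class has

# Step 2: split with stratify=y so each class is represented
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
print(np.unique(y_train))  # both classes present in the training set
```

Note that stratification only helps in the second case: if `Counter(y)` shows a single class, no splitting strategy can fix it, and `train_test_split` with `stratify=y` will itself raise an error when the least populated class has fewer than two members.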

Upvotes: 1
