Akhmad Zaki

Reputation: 433

Python Scikit-Learn SVM - No Predicted Samples for a Class

I am working on a classification task in Python that assigns audio files of different musical instruments to their respective classes. In my case there are 4 classes: Brass, String, Percussion, and Woodwind. I used an SVM as the classifier. My code looks roughly like this (I did not change any of the classifier's parameters):

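from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# X is the feature matrix, y is the class vector
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# SVM classifier with default parameters
svm = SVC()
svm.fit(X_train, y_train)
svm_pred = svm.predict(X_test)
print(metrics.classification_report(y_test, svm_pred))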

When I run this code, I get a problem with the classifier. The classification report and the warning look like this:

            precision  recall   f1-score   support

Brass         1.00      0.21      0.34        72
Percussion    0.38      1.00      0.55       279
String        1.00      0.15      0.26       276
Woodwind      0.00      0.00      0.00       156

avg / total   0.58      0.43      0.32       783

C:\Users\Anaconda3\lib\site-packages\sklearn\metrics\classification.py:1135: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.

When I checked the labels predicted by the SVM classifier (svm_pred), the Woodwind class was never predicted:

>>> set(svm_pred)
{'Brass','String','Percussion'}

The number of samples for each class is: Brass = 200, Woodwind = 500, Percussion = 900, and String = 800, so the data is a bit imbalanced.

My question is: is it possible for an SVM classifier to never predict a class at all in its output, as in my case above?

Upvotes: 0

Views: 1389

Answers (2)

May Pilijay El

Reputation: 21

Another possible problem: if you do not stratify when splitting your dataset, the class proportions can differ between the splits, and some splits may not contain one class at all while others do. Try passing stratify=y to train_test_split to solve the problem :)
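As a rough sketch of what stratify=y does, the snippet below builds placeholder data with the class counts mentioned in the question and checks how many samples of each class land in each split. The random features, the `counts` dictionary, and the fixed `random_state` are assumptions made only for illustration; they are not the asker's real data.

```python
from collections import Counter

import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data (assumption): class counts as stated in the question,
# random features standing in for the real audio features.
counts = {"Brass": 200, "Woodwind": 500, "Percussion": 900, "String": 800}
y = np.repeat(list(counts.keys()), list(counts.values()))
rng = np.random.RandomState(0)
X = rng.normal(size=(len(y), 5))

# stratify=y keeps the class proportions (approximately) equal in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

train_counts = Counter(y_train.tolist())
test_counts = Counter(y_test.tolist())
for label in counts:
    print(label, "train:", train_counts[label], "test:", test_counts[label])
```

With a stratified split, every class should appear in both the training and the test set in roughly a 70/30 ratio, so a class cannot go missing from training just because of an unlucky shuffle.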

Upvotes: 2

KRKirov

Reputation: 4004

If Woodwind is as well represented in the training set as in the test set, my guess is that your model fits the data very poorly and therefore never predicts this class. Try scaling any numerical features using sklearn's scale()

http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html

and the different kernel options of the SVM classifier

http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
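A minimal sketch of both suggestions, under stated assumptions: the data is synthetic (four well-separated blobs with the question's class counts, with features deliberately put on very different scales), and StandardScaler inside a pipeline is used in place of scale() so that the scaling is learned on the training set only. The unscaled model uses gamma="auto", which was the default in older scikit-learn releases; whether this matches the asker's setup is a guess.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the audio features (assumption, not the real data).
names = np.array(["Brass", "Woodwind", "Percussion", "String"])
centers = np.array([
    [0, 0, 0, 0, 0, 0],
    [6, 6, 6, 6, 6, 6],
    [-6, -6, -6, -6, -6, -6],
    [6, -6, 6, -6, 6, -6],
], dtype=float)
X, y_idx = make_blobs(n_samples=[200, 500, 900, 800], centers=centers,
                      cluster_std=1.0, random_state=0)
X = X * np.array([1, 10, 100, 1e3, 1e4, 1e5])  # wildly different feature scales
y = names[y_idx]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# RBF SVM on the raw features, for comparison with the scaled pipelines.
unscaled = SVC(kernel="rbf", gamma="auto").fit(X_train, y_train)
print("unscaled rbf predicts:", sorted(set(unscaled.predict(X_test).tolist())))

# The same classifier with standardized features, trying several kernels.
scores, predicted = {}, {}
for kernel in ("linear", "rbf", "poly"):
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    model.fit(X_train, y_train)
    predicted[kernel] = sorted(set(model.predict(X_test).tolist()))
    scores[kernel] = model.score(X_test, y_test)
    print(kernel, "accuracy:", round(scores[kernel], 3),
          "classes predicted:", predicted[kernel])
```

On real data the best kernel and its parameters would normally be chosen with cross-validation (for example GridSearchCV) rather than by comparing test-set scores as this sketch does.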

Upvotes: 1
