Han

Reputation: 77

One-vs-all SVM introduces class imbalance

The one-against-all approach trains as many binary classifiers as there are classes, n. Each classifier labels one class as 1 and all remaining classes as 0:

model = cell(numLabels,1);
for k=1:numLabels
    % train the k-th binary classifier: class k is the positive class (1),
    % all remaining classes are pooled into the negative class (0)
    model{k} = svmtrain(double(trainLabel==k), trainData, '-c 1 -g 0.2 -b 1');
end

For a large number of classes and a large number of images (5000 per class), the code above trains each classifier on 5000 positive samples versus the (n-1)*5000 samples of all remaining classes. Is this imbalance acceptable? Would one-vs-one be better for avoiding the imbalance, or does it depend on the classification problem? And how can I tell whether the imbalance is actually causing me a problem?

Upvotes: 1

Views: 1782

Answers (2)

lennon310

Reputation: 12689

The intuitive concern with such an imbalance is that the minority class risks missing true positives after training. However, the one-vs-all approach ends with a pooling step in which the classifier with the maximum output is selected among all of those "TP-missing" classifiers. So a minor imbalance won't cause a huge problem.
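For example, with the libsvm MATLAB interface used in the question, that selection step might look like the following minimal sketch (testData and testLabel are assumed names here, and the models must have been trained with '-b 1' so that svmpredict can return probability estimates):

prob = zeros(size(testData,1), numLabels);
for k=1:numLabels
    % probability estimates from the k-th one-vs-all classifier
    [~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
    prob(:,k) = p(:, model{k}.Label==1);  % keep the positive-class column
end
% predict the class whose classifier outputs the highest probability
[~,pred] = max(prob, [], 2);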

Upvotes: 0

lejlot

Reputation: 66805

Yes, in general one-vs-one is a better solution in terms of balance, but it comes at a much higher computational and memory cost, since you need n(n-1)/2 classifiers instead of n. So it is common practice to train one-vs-all instead. To avoid the imbalance problem you can use a "class weight" scheme, which ensures that the smaller class gets more of the learner's focus (misclassifying the underrepresented class becomes more costly); see the sketch below.
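In libsvm this is exposed through the -wi option, which multiplies the penalty C for class i. A minimal sketch, adapting the training loop from the question (weighting the positive class by neg/pos is one common heuristic, not the only choice):

model = cell(numLabels,1);
for k=1:numLabels
    pos = sum(trainLabel==k);        % samples of the current class
    neg = numel(trainLabel) - pos;   % samples of all other classes
    % penalize mistakes on the minority (positive) class more heavily
    opts = sprintf('-c 1 -g 0.2 -b 1 -w1 %f -w0 1', neg/pos);
    model{k} = svmtrain(double(trainLabel==k), trainData, opts);
end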

Upvotes: 1
