Multilabel image classification: is it necessary to have training data for each combination of labels?

Question

I want to train a CNN for a multilabel image classification task using Keras. However, I am not sure how to prepare my training data. More specifically, I am wondering whether I need training images that show a combination of two or more labels, or whether it is sufficient to train the network on single-label images, after which it will be able to detect multiple labels within an image.

I think my question becomes clearer with an example:

Say I am using the dog vs cat classification dataset and I want to build a model that can classify an image as showing a dog, a cat, or both animals. In this case, do I need to train the model with images showing cats, images showing dogs AND images showing both together, or is it sufficient to have training images that each display only a cat or only a dog?

Marcin Możejko · Accepted Answer

Well, when an image can contain more than one class, your problem changes from multiclass classification (assigning exactly one class to an image) to multilabel classification (making a separate yes/no decision for each class). E.g. in your example, the output of your network should be two-dimensional, with a separate sigmoid output for each class:

from keras.layers import Dense
output = Dense(nb_of_classes, activation='sigmoid')(previous_layer)
model.compile(loss='binary_crossentropy', ...)

As you can see, you are effectively training two separate binary classifiers (one per class, sharing the same network) instead of one. In my experience, training on single-label images alone should work fine, although including examples in which both classes are present makes training more efficient.
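To make the "one decision per class" idea concrete, here is a minimal plain-Python sketch of how targets could be encoded and how the sigmoid outputs could be read back. It is an illustration under assumptions, not Keras's implementation: the label order in `CLASSES`, the helper names `multi_hot`, `binary_crossentropy` and `decode`, and the 0.5 threshold are all hypothetical choices made for this example.

```python
import math

# Hypothetical label order; each position corresponds to one output unit.
CLASSES = ["cat", "dog"]


def multi_hot(labels, classes=CLASSES):
    """Encode the set of labels present in one image as a 0/1 vector."""
    return [1.0 if c in labels else 0.0 for c in classes]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def binary_crossentropy(y_true, logits, eps=1e-7):
    """Average of per-label binary cross-entropy terms.

    Each label contributes its own term, so every output unit is
    scored independently of the others.
    """
    total = 0.0
    for t, z in zip(y_true, logits):
        p = min(max(sigmoid(z), eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(y_true)


def decode(logits, threshold=0.5, classes=CLASSES):
    """Return every class whose sigmoid output reaches the threshold."""
    return [c for c, z in zip(classes, logits) if sigmoid(z) >= threshold]


# Targets for the three kinds of image discussed in the question.
cat_only = multi_hot({"cat"})
dog_only = multi_hot({"dog"})
both = multi_hot({"cat", "dog"})

# Made-up raw network outputs (pre-sigmoid), for illustration only.
print(decode([2.0, 3.0]))
print(decode([-2.0, 3.0]))
```

Because each target position is independent, an image showing both animals is simply a vector with both positions set, and single-label images still teach each output unit its own class.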
