Dmitriy R. Starson

Reputation: 1918

Classify visually distinct objects as one class

We are building a neural network to classify objects and have a large dataset of images covering 1000 classes. One of the classes is “banana”, and it contains 1000 images of bananas. About 10% of those images show mashed bananas, which are visually very different from the rest of the images in that class.

If we want both mashed bananas and regular bananas to be classified correctly, should we split the banana images into two separate classes and train on them separately, or keep the two subsets merged into a single class?

I am trying to understand how the presence of a visually distinct subclass impacts the recognition of a given class.

Upvotes: 1

Views: 172

Answers (2)

MSalters

Reputation: 179991

The problem here is simple: you need your neural network to learn both groups of images, which means you need to back-propagate sensible error information. If you do have ground-truth information about mashed bananas, back-propagating it is definitely useful. It helps the first layers learn two sets of features.

Note that the nice thing about neural networks is that you can back-propagate any kind of error vector. If your output has 3 nodes (banana, non-mashed banana, mashed banana), you basically sidestep the binary choice implied in your question. You can always drop output nodes during inference.
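A minimal sketch of that idea, assuming PyTorch (the answer names no framework): the output layer gets a parent “banana” node plus two sub-nodes, the targets are multi-hot, and the sub-nodes are simply ignored at inference time. The backbone, output indices, and image sizes below are placeholders, not part of the original question.

```python
# Sketch of the "parent node + sub-nodes, multi-hot error vector" idea in PyTorch.
# Class indices, the backbone, and image sizes are hypothetical.
import torch
import torch.nn as nn

NUM_OUTPUTS = 1002                         # 999 other classes + banana + regular + mashed
BANANA, REGULAR, MASHED = 999, 1000, 1001  # hypothetical output indices

backbone = nn.Sequential(                  # stand-in for any convolutional feature extractor
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 512),
    nn.ReLU(),
)
model = nn.Sequential(backbone, nn.Linear(512, NUM_OUTPUTS))

criterion = nn.BCEWithLogitsLoss()         # accepts any multi-hot target vector

def make_target(is_banana: bool, is_mashed: bool) -> torch.Tensor:
    """Multi-hot target: the parent node plus the matching sub-node."""
    t = torch.zeros(NUM_OUTPUTS)
    if is_banana:
        t[BANANA] = 1.0
        t[MASHED if is_mashed else REGULAR] = 1.0
    return t

# One training step on a toy batch of four banana images (one of them mashed).
images = torch.randn(4, 3, 64, 64)
targets = torch.stack([make_target(True, m) for m in (False, False, True, False)])
loss = criterion(model(images), targets)
loss.backward()

# Inference: drop the sub-nodes and keep only the parent "banana" score.
with torch.no_grad():
    banana_score = torch.sigmoid(model(images)[:, BANANA])
```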

Upvotes: 2

KonstantinosKokos

Reputation: 3473

There is no standard answer here. If the subclasses are distinct in feature space, it may be very hard for your network to generalize over them as a single class, in which case introducing multiple dummy classes that you collapse into one via post-processing would be the ideal solution.

You could also pretrain a model with the distinct classes (so that it builds representations that discriminate between them), then pop the final network layer (the classifier) and replace it with a collapsed classifier fitted to the original labels. This gives you discriminating representations that are simply classified under a common label.

In any case, I would advise you to construct the subclass-specific labels and check the per-subclass error while training with the original classes. That way you can quantify the prediction error you actually get and avoid over-engineering your network in case it can learn the task by itself without stricter supervision.
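A rough sketch of the second suggestion, assuming PyTorch (the answer does not name a framework): pretrain with the subclass labels, reuse the backbone with a new, collapsed classifier head, and keep the subclass labels around purely for per-subclass error diagnostics. Class counts, the backbone, and shapes are placeholders.

```python
# Sketch of "pretrain on subclasses, then collapse the classifier" plus
# per-subclass error tracking. Backbone, class counts, and shapes are hypothetical.
import torch
import torch.nn as nn

NUM_FINE = 1001     # banana split into regular/mashed adds one class
NUM_COARSE = 1000   # original label set

backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 512), nn.ReLU())
model = nn.Sequential(backbone, nn.Linear(512, NUM_FINE))

# ... pretrain `model` here with the fine-grained (subclass) labels ...

# Pop the fine-grained classifier and attach a collapsed 1000-way head,
# then fine-tune with the original labels. The backbone (and thus the
# subclass-discriminating representations) is reused as-is.
model = nn.Sequential(backbone, nn.Linear(512, NUM_COARSE))

def per_subclass_accuracy(logits, coarse_labels, subclass_labels, subclass_id):
    """Accuracy restricted to one subclass (e.g. mashed bananas), for diagnostics."""
    mask = subclass_labels == subclass_id
    if mask.sum() == 0:
        return float("nan")
    preds = logits[mask].argmax(dim=1)
    return (preds == coarse_labels[mask]).float().mean().item()
```

If the per-subclass accuracy for mashed bananas stays close to the overall banana accuracy while training with the original labels only, the extra supervision is probably unnecessary.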

Upvotes: 1
