Reputation: 141
Using a CNN, I would like to take an image where each pixel is annotated with 3 labels. Something like this:
0,1 (not object/object)
0,1,2,3... (Class of object, eg cat,dog)
0,1,2,3...(Object Number of given class eg, 2nd instance of cat)
In other words given a picture of multiple cats and dogs the CNN would output that a given pixel is from an object, that object is a cat and it belongs to the second instance of cat in the image (counting from the top left hand corner for example).
Is this possible to do with a single CNN or would I have to combine multiple CNN's to achieve this result?
EDIT: I should note I understand I would initially have to train the CNN with annotated images where each pixel already has 2 or 3 labels as above.
Upvotes: 0
Views: 621
Reputation: 3649
You should look into Fully Convolutional Neural Networks. Basically, these are CNNs without Fully Connected layers, they contain deconvolution layers instead. So, given a NxN sized image it outputs a NxN sized image, each pixel having a label for itself, which has a direct application in semantic segmentation.
Upvotes: 1