jonas smith

Reputation: 555

NN Model Architecture Per-Pixel Classification

I'm familiar with how (C)NNs work in general for classification problems (2d image -> 1 class), but I don't know how to structure a network that will take a 2d image and output a 2d matrix of classification values.

Effectively, I have a set of NxN images (1 channel) that I want to classify on a "per-pixel" basis. I want the output to be an NxN set of classes such that for a pixel at location (a,b), the result will be the classification result for pixel (a,b) in the input image.
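To make the input/output contract concrete, here is a minimal sketch of the shapes involved (NumPy; `N` and `num_classes` are illustrative placeholders, not values from my actual data):

```python
import numpy as np

N = 8           # assumed image size for illustration
num_classes = 3  # assumed number of per-pixel classes

# One 1-channel NxN input image
image = np.random.rand(N, N)

# Desired output: an NxN matrix where entry (a, b) is the
# class label predicted for pixel (a, b) of the input image
labels = np.random.randint(0, num_classes, size=(N, N))

print(image.shape, labels.shape)  # (8, 8) (8, 8)
```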

Any suggestions on model architecture?

P.S. I've heard of patch-based methods for doing this, but I want to feed the entire NxN image into the network without "patching".

Thanks! Joe

Upvotes: 2

Views: 644

Answers (1)

Maksim Khaitovich

Reputation: 4792

In general there is nothing special about this task. DNNs can perform several classification or regression tasks simultaneously, and since the weights are shared, the prediction for each pixel is linked to all the other pixels in the image.

So long story short, possible algorithm to tackle this problem:

1) Ensure that you have a training set where the inputs are NxN images and the targets are NxN matrices of class labels (one label per pixel)

2) Build a DNN with an architecture you would usually use for image classification, e.g. a couple of convolution layers with max-pooling, followed by 2-3 fully-connected ReLU layers

3) Ensure that your output layer has size NxN and is not a softmax layer (use ReLU again)

4) Train it!
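The steps above can be sketched roughly as follows (PyTorch here; `N` and all layer sizes are illustrative assumptions, not prescriptions):

```python
import torch
import torch.nn as nn

N = 32  # assumed image size

class PerPixelNet(nn.Module):
    def __init__(self, n):
        super().__init__()
        self.n = n
        # A couple of convolution layers with max-pooling (step 2)
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Fully-connected ReLU layers, ending in an NxN-sized
        # ReLU output rather than a softmax (step 3)
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * (n // 4) * (n // 4), 256), nn.ReLU(),
            nn.Linear(256, n * n), nn.ReLU(),
        )

    def forward(self, x):
        # Reshape the flat NxN output back into a 2-D map
        return self.head(self.features(x)).view(-1, self.n, self.n)

model = PerPixelNet(N)
x = torch.rand(4, 1, N, N)      # batch of 4 one-channel NxN images
y = model(x)
print(tuple(y.shape))           # (4, 32, 32)
```

Training (step 4) then proceeds as usual, with a per-pixel loss such as MSE between the NxN output map and the NxN target matrix.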

This should work absolutely fine. I can also reassure you that it is not uncommon to get several semi-independent outputs from a DNN. For example, here the same network is used to locate all facial keypoints at the same time (eyes, mouth, nose, etc.).

Upvotes: 2
