Reputation: 91
Apologies for any misuse of technical terms. I am working on a semantic segmentation project using CNNs and trying to implement an Encoder-Decoder architecture, so the output is the same size as the input.
How do you design the labels? What loss function should one apply, especially under heavy class imbalance (where the ratio between the classes varies from image to image)?
The problem deals with two classes (objects of interest and background). I am using Keras with tensorflow backend.
So far, I am designing the expected outputs to have the same dimensions as the input images, with pixel-wise labeling. The final layer of the model has either a softmax activation (for 2 classes) or a sigmoid activation (to express the probability that a pixel belongs to the object class). I am having trouble designing a suitable objective function for such a task, of type:
function(y_true, y_pred),
in agreement with Keras.
Please try to be specific about the dimensions of the tensors involved (input/output of the model). Any thoughts and suggestions are much appreciated. Thank you!
Upvotes: 4
Views: 859
Reputation: 111
Two ways:
You could try 'flattening':
# Assumes a channels-first output of shape (NUM_CLASSES, HEIGHT, WIDTH)
model.add(Reshape((NUM_CLASSES, HEIGHT * WIDTH)))  # shape: (NUM_CLASSES, HEIGHT*WIDTH)
model.add(Permute((2, 1)))  # now it is (HEIGHT*WIDTH, NUM_CLASSES)
# Apply an activation here, e.g. model.add(Activation('softmax'))
# You can use global averaging or softmax
One hot encoding every pixel:
In this case your final layer should upsample/unpool/deconvolve to HEIGHT x WIDTH x NUM_CLASSES, so your output has the shape (HEIGHT, WIDTH, NUM_CLASSES).
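To make the tensor shapes concrete, here is a minimal NumPy sketch of how a per-pixel one-hot label could be built for both variants above. The sizes and the toy mask are hypothetical, and it assumes a channels-last layout with a leading batch axis:

```python
import numpy as np

HEIGHT, WIDTH, NUM_CLASSES = 4, 6, 2  # hypothetical sizes for illustration

# Integer mask of shape (HEIGHT, WIDTH): 0 = background, 1 = object
mask = np.zeros((HEIGHT, WIDTH), dtype=np.int64)
mask[1:3, 2:5] = 1

# One-hot encode every pixel: shape (HEIGHT, WIDTH, NUM_CLASSES)
one_hot = np.eye(NUM_CLASSES, dtype=np.float32)[mask]

# Add a batch axis to match a model output of shape
# (batch, HEIGHT, WIDTH, NUM_CLASSES)
y_true = one_hot[np.newaxis, ...]

# Label for the 'flattened' variant: (batch, HEIGHT*WIDTH, NUM_CLASSES)
y_true_flat = y_true.reshape(1, HEIGHT * WIDTH, NUM_CLASSES)
```

With a single sigmoid output channel instead of a two-class softmax, the label would simply be the mask itself with shape (batch, HEIGHT, WIDTH, 1).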
Upvotes: 1
Reputation: 7148
I suggest starting with a base architecture used in practice, like the one in this nerve-segmentation project: https://github.com/EdwardTyantov/ultrasound-nerve-segmentation. It uses a dice_loss as the loss function, which works very well for two-class problems, as has been shown in the literature: https://arxiv.org/pdf/1608.04117.pdf.
Another loss function that has been widely used for such problems is cross entropy. For problems like yours, long and short skip connections are most commonly used to stabilize training, as described in the paper above.
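For reference, a soft Dice loss with the Keras signature (y_true, y_pred) might look roughly like the sketch below. This is not the code from the linked repository; the `smooth` term and the per-image averaging are assumptions, and it expects tensors of shape (batch, HEIGHT, WIDTH, 1) with sigmoid outputs:

```python
import tensorflow as tf

def dice_loss(y_true, y_pred, smooth=1.0):
    """Soft Dice loss; `smooth` is an assumed constant that avoids 0/0."""
    y_true = tf.cast(y_true, y_pred.dtype)
    axes = [1, 2, 3]  # sum over height, width and channel, keep the batch axis
    intersection = tf.reduce_sum(y_true * y_pred, axis=axes)
    total = tf.reduce_sum(y_true, axis=axes) + tf.reduce_sum(y_pred, axis=axes)
    dice = (2.0 * intersection + smooth) / (total + smooth)
    # Average the per-image Dice coefficient and turn it into a loss
    return 1.0 - tf.reduce_mean(dice)
```

It could then be passed as model.compile(loss=dice_loss, ...). Because the coefficient is computed per image as a ratio of overlaps, it is less sensitive to how many background pixels an image contains than a plain pixel-averaged cross entropy.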
Upvotes: 1
Reputation: 40506
Actually, when you use a TensorFlow backend, you can simply apply one of the predefined Keras objectives in the following manner:
from keras.layers import Convolution2D

output = Convolution2D(number_of_classes,  # 1 for the binary case
                       (filter_height, filter_width),  # kernel size as a tuple
                       activation="softmax")(input_to_output)  # or "sigmoid" for binary
...
model.compile(loss="categorical_crossentropy", ...)  # or "binary_crossentropy" for binary
Then feed either a one-hot encoded feature map or a matrix of shape (image_height, image_width) with integer-encoded classes (remember that in this case you should use sparse_categorical_crossentropy as the loss).
To deal with class imbalance (I guess it's because of the background class), I strongly recommend carefully reading the answers to this Stack Overflow question.
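One common way to handle an imbalance that changes from image to image is to weight each pixel by the inverse frequency of its class within that image. The NumPy sketch below only illustrates the arithmetic and is not necessarily what the linked answers recommend. Both function names are hypothetical:

```python
import numpy as np

def balanced_pixel_weights(mask, num_classes=2):
    """Per-pixel weights from the inverse class frequency of this one image."""
    counts = np.bincount(mask.ravel(), minlength=num_classes).astype(np.float64)
    present = counts > 0
    # Classes absent from the image keep weight 0 (no pixel refers to them)
    class_weights = np.zeros(num_classes)
    class_weights[present] = mask.size / (present.sum() * counts[present])
    return class_weights[mask]  # same shape as mask

def weighted_binary_crossentropy(y_true, y_prob, weights, eps=1e-7):
    """Weighted mean of the per-pixel binary cross entropy."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    ce = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float((weights * ce).sum() / weights.sum())
```

With these weights, every class that is present contributes the same total weight to the loss, regardless of how few pixels it covers in a given image.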
Upvotes: 1