Reputation: 375
I am trying to train a U-Net derivative to do single-class image segmentation, but I am having problems using the tf.keras.losses.SparseCategoricalCrossentropy() and tf.keras.losses.CategoricalCrossentropy() loss functions in Keras. Which is more appropriate, and how do I use it properly?
If I try to use SparseCategoricalCrossentropy, I get the error:
Received a label value of 1 which is outside the valid range of [0, 1)
If I try to use CategoricalCrossentropy, I get:
You are passing a target array of shape (3600, 64, 64, 1) while using as loss categorical_crossentropy. categorical_crossentropy expects targets to be binary matrices (1s and 0s) of shape (samples, classes). If your targets are integer classes, you can convert them to the expected format via: y_binary = tf.keras.utils.to_categorical(y_int)
Using to_categorical on my mask-vs-background segmentation problem increases the last dimension to 2, which should not be necessary. My prediction should be a number between 0 and 1 in a single "channel".
Model definition snippet:
input_x = tf.keras.Input(batch_shape=(batch_size, xsze, ysze, 3), name='input_x')
predictions = tf.keras.layers.Conv2D(1, [1, 1], activation='linear', name='output_x')(drop11)
loss = tf.keras.losses.SparseCategoricalCrossentropy()
model.compile(optimizer=tf.keras.optimizers.Adam(),  # Optimizer
              loss=loss,
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])
checkpointer = tf.keras.callbacks.ModelCheckpoint(session_name + '_backup.h5', save_best_only=True, monitor='acc', verbose=0)
early_stopper = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, verbose=1, min_delta=0.005)
history = model.fit(data_train, roi_train,
                    batch_size=batch_size,
                    epochs=10,
                    validation_data=(data_val, roi_zoom_val),
                    callbacks=[checkpointer, early_stopper])
My roi_train is a numpy array of 0s and 1s of type float32.
Upvotes: 0
Views: 1005
Reputation: 33420
Since you have only one class and want each value in the segmentation map to be between 0 and 1, you should use sigmoid as the activation of the last layer and binary_crossentropy as the loss function. That's because for each pixel you face a binary decision: does this pixel belong to the foreground or the background?
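As a rough sketch of what that change could look like, here is a small self-contained example. The U-Net body from the question is not shown there, so a single Conv2D layer stands in for it, and the image size, sample count, and random arrays are made-up placeholders rather than your real data. The only parts that matter are the sigmoid output layer and the loss/metric passed to compile.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes, chosen only to keep the sketch small and fast.
height, width = 64, 64

inputs = tf.keras.Input(shape=(height, width, 3), name="input_x")
# Stand-in for the U-Net body, which the question does not show.
features = tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu")(inputs)
# One output channel with sigmoid: a per-pixel foreground probability in [0, 1].
predictions = tf.keras.layers.Conv2D(1, 1, activation="sigmoid", name="output_x")(features)
model = tf.keras.Model(inputs=inputs, outputs=predictions)

model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=[tf.keras.metrics.BinaryAccuracy()],
)

# Random placeholders for data_train and roi_train: the masks are float32
# arrays of 0s and 1s with shape (samples, height, width, 1).
rng = np.random.default_rng(0)
data_train = rng.random((4, height, width, 3)).astype("float32")
roi_train = rng.integers(0, 2, size=(4, height, width, 1)).astype("float32")

history = model.fit(data_train, roi_train, batch_size=2, epochs=1, verbose=0)
probabilities = model.predict(data_train, verbose=0)  # shape (4, 64, 64, 1)
```

With this setup the masks can stay as a single channel, so to_categorical is not needed. If you would rather keep activation='linear' on the last layer, tf.keras.losses.BinaryCrossentropy(from_logits=True) should be the equivalent choice. Note also that the metric is now logged under a different name (binary_accuracy by default, as far as I know), so a ModelCheckpoint monitoring 'acc' would probably need its monitor argument updated.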
Upvotes: 1