Reputation: 45941
I'm using this:
Python version: 3.7.7 (default, May 6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)]
TensorFlow version: 2.1.0
Eager execution: True
With this U-Net model:
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D, UpSampling2D,
                                     Cropping2D, ZeroPadding2D, concatenate)
from tensorflow.keras.models import Model

inputs = Input(shape=img_shape)
conv1 = Conv2D(64, (5, 5), activation='relu', padding='same', data_format="channels_last", name='conv1_1')(inputs)
conv1 = Conv2D(64, (5, 5), activation='relu', padding='same', data_format="channels_last", name='conv1_2')(conv1)
pool1 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool1')(conv1)
conv2 = Conv2D(96, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv2_1')(pool1)
conv2 = Conv2D(96, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv2_2')(conv2)
pool2 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool2')(conv2)
conv3 = Conv2D(128, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv3_1')(pool2)
conv3 = Conv2D(128, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv3_2')(conv3)
pool3 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool3')(conv3)
conv4 = Conv2D(256, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv4_1')(pool3)
conv4 = Conv2D(256, (4, 4), activation='relu', padding='same', data_format="channels_last", name='conv4_2')(conv4)
pool4 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool4')(conv4)
conv5 = Conv2D(512, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv5_1')(pool4)
conv5 = Conv2D(512, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv5_2')(conv5)
up_conv5 = UpSampling2D(size=(2, 2), data_format="channels_last", name='up_conv5')(conv5)
ch, cw = get_crop_shape(conv4, up_conv5)
crop_conv4 = Cropping2D(cropping=(ch, cw), data_format="channels_last", name='crop_conv4')(conv4)
up6 = concatenate([up_conv5, crop_conv4])
conv6 = Conv2D(256, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv6_1')(up6)
conv6 = Conv2D(256, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv6_2')(conv6)
up_conv6 = UpSampling2D(size=(2, 2), data_format="channels_last", name='up_conv6')(conv6)
ch, cw = get_crop_shape(conv3, up_conv6)
crop_conv3 = Cropping2D(cropping=(ch, cw), data_format="channels_last", name='crop_conv3')(conv3)
up7 = concatenate([up_conv6, crop_conv3])
conv7 = Conv2D(128, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv7_1')(up7)
conv7 = Conv2D(128, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv7_2')(conv7)
up_conv7 = UpSampling2D(size=(2, 2), data_format="channels_last", name='up_conv7')(conv7)
ch, cw = get_crop_shape(conv2, up_conv7)
crop_conv2 = Cropping2D(cropping=(ch, cw), data_format="channels_last", name='crop_conv2')(conv2)
up8 = concatenate([up_conv7, crop_conv2])
conv8 = Conv2D(96, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv8_1')(up8)
conv8 = Conv2D(96, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv8_2')(conv8)
up_conv8 = UpSampling2D(size=(2, 2), data_format="channels_last", name='up_conv8')(conv8)
ch, cw = get_crop_shape(conv1, up_conv8)
crop_conv1 = Cropping2D(cropping=(ch, cw), data_format="channels_last", name='crop_conv1')(conv1)
up9 = concatenate([up_conv8, crop_conv1])
conv9 = Conv2D(64, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv9_1')(up9)
conv9 = Conv2D(64, (3, 3), activation='relu', padding='same', data_format="channels_last", name='conv9_2')(conv9)
ch, cw = get_crop_shape(inputs, conv9)
conv9 = ZeroPadding2D(padding=(ch, cw), data_format="channels_last", name='conv9_3')(conv9)
conv10 = Conv2D(1, (1, 1), activation='sigmoid', data_format="channels_last", name='conv10_1')(conv9)
model = Model(inputs=inputs, outputs=conv10)
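(`get_crop_shape` is a small helper that isn't shown above; it returns the `((top, bottom), (left, right))` amounts needed to crop the skip-connection tensor down to the upsampled tensor's spatial size. A minimal sketch of what it typically computes, assuming channels-last tensors with known static shapes:)

```python
def get_crop_shape(target, refer):
    # crop amounts so target's H x W match refer's (channels_last: N, H, W, C)
    ch = int(target.shape[1]) - int(refer.shape[1])  # height surplus
    cw = int(target.shape[2]) - int(refer.shape[2])  # width surplus
    assert ch >= 0 and cw >= 0
    # split each surplus across both sides; an odd surplus puts the extra pixel last
    return (ch // 2, ch - ch // 2), (cw // 2, cw - cw // 2)
```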
And with these functions:
from tensorflow.keras import backend as K

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    # the +1.0 terms smooth the ratio and avoid division by zero on empty masks
    return (2.0 * intersection + 1.0) / (K.sum(y_true_f) + K.sum(y_pred_f) + 1.0)

def dice_coef_loss(y_true, y_pred):
    return 1.0 - dice_coef(y_true, y_pred)
To compile the model I do:
model.compile(tf.keras.optimizers.Adam(lr=2e-4), loss=dice_coef_loss, metrics=[dice_coef])
And I get this output while training:
Train on 5 samples, validate on 5 samples
Epoch 1/2
5/5 [==============================] - 8s 2s/sample - loss: 1.0000 - dice_coef: 4.5962e-05 - val_loss: 0.9929 - val_dice_coef: 0.0071
Epoch 2/2
5/5 [==============================] - 5s 977ms/sample - loss: 0.9703 - dice_coef: 0.0297 - val_loss: 0.9939 - val_dice_coef: 0.0061
I think the idea is for the loss to get close to zero, but I don't understand the 1.0000 I get at the start (maybe it is the worst possible loss value). And I don't understand the dice_coef value: what does it mean?
Upvotes: 0
Views: 221
Reputation: 445
Dice loss is a loss function that avoids some of the limitations of the ordinary cross-entropy loss.
When using cross-entropy loss, the statistical distribution of the labels plays a big role in training accuracy: the more unbalanced the label distribution is, the harder training becomes. Weighted cross-entropy can alleviate the difficulty, but the improvement is not significant and the intrinsic issue of cross-entropy loss remains. Cross-entropy is computed as the average of per-pixel losses, and each per-pixel loss is calculated discretely, without knowing whether adjacent pixels are boundaries or not. As a result, cross-entropy only considers loss in a micro sense rather than globally, which is not enough for image-level prediction.
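A toy example makes the imbalance point concrete (plain NumPy, and unsmoothed Dice for clarity): on a mask where only 1 of 100 pixels is foreground, a model that lazily predicts "almost all background" everywhere gets a deceptively low cross-entropy but an almost-zero Dice score:

```python
import numpy as np

# 1 foreground pixel out of 100
y_true = np.zeros(100)
y_true[0] = 1.0
# lazy model: predicts a flat 1% foreground probability everywhere
y_pred = np.full(100, 0.01)

eps = 1e-7  # numerical safety for the logs
cross_entropy = -np.mean(y_true * np.log(y_pred + eps)
                         + (1.0 - y_true) * np.log(1.0 - y_pred + eps))
intersection = np.sum(y_true * y_pred)
dice = 2.0 * intersection / (np.sum(y_true) + np.sum(y_pred))

print(cross_entropy)  # small: per-pixel averaging makes the model look fine
print(dice)           # near zero: almost no real overlap with the foreground
```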
The Dice coefficient can be written as:

    dice = (2 * Σ_i p_i g_i) / (Σ_i p_i + Σ_i g_i)

which is exactly what your function dice_coef(y_true, y_pred) is calculating (with an extra smoothing term of +1.0 in both numerator and denominator). More on that: Sørensen–Dice coefficient.
In the equation above, p_i and g_i are pairs of corresponding pixel values of the prediction and the ground truth, respectively. In a boundary-detection scenario, their values are either 0 or 1, representing whether the pixel is a boundary (value 1) or not (value 0). The denominator is the sum of total boundary pixels of both prediction and ground truth, and the numerator is the sum of correctly predicted boundary pixels, because the sum only increments when p_i and g_i match (both have value 1).
The denominator considers the total number of boundary pixels at global scale, while the numerator considers the overlap between the two sets at local scale. Therefore, Dice loss considers the loss information both locally and globally, which is critical for high accuracy.
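So the value your dice_coef metric reports is this overlap score, ranging from near 0 (no overlap) to 1 (perfect overlap). A quick NumPy re-implementation of your exact formula (including the +1 smoothing) shows the two extremes:

```python
import numpy as np

def dice_coef_np(y_true, y_pred, smooth=1.0):
    # same formula as the Keras dice_coef above, in plain NumPy
    y_true_f = y_true.ravel()
    y_pred_f = y_pred.ravel()
    intersection = np.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (np.sum(y_true_f) + np.sum(y_pred_f) + smooth)

mask = np.array([1.0, 1.0, 0.0, 0.0])
print(dice_coef_np(mask, mask))        # perfect prediction -> 1.0
print(dice_coef_np(mask, 1.0 - mask))  # completely wrong -> 0.2 (smoothing keeps it above 0)
```

The dice_coef of 4.5962e-05 you logged in epoch 1 therefore just means the network started with essentially no overlap with the ground truth, which is expected.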
Regarding your training: since the loss value is decreasing, you shouldn't worry too much. Try increasing the number of epochs and analysing how the network evolves as training progresses.
The Dice loss is simply 1 - dice_coef, which is what your dice_coef_loss function is calculating.
Upvotes: 1