samra irshad

Reputation: 213

Dice loss becomes NAN after some epochs

I am working on an image-segmentation application where the loss function is Dice loss. The issue is that the loss function becomes NAN after some epochs. I am doing 5-fold cross validation and checking validation and training losses for each fold. For some folds, the loss quickly becomes NAN, and for some folds, it takes a while to reach NAN. I have inserted a smoothing constant in the loss function formulation to avoid over/under-flow, but the same problem still occurs. My inputs are scaled within the range [-1, 1]. I have seen people suggest using regularizers and different optimizers, but I don't understand why the loss gets to NAN in the first place. I have pasted the loss function, and the training and validation losses for some epochs, below. Initially, only the validation loss and validation dice score become NAN, but later all metrics become NAN.

import tensorflow as tf

def dice_loss(y_true, y_pred):  # y_true --> ground truth, y_pred --> predictions
    smooth = 1.  # smoothing constant to avoid division by zero
    y_true_f = tf.keras.backend.flatten(y_true)
    y_pred_f = tf.keras.backend.flatten(y_pred)
    intersection = tf.keras.backend.sum(y_true_f * y_pred_f)
    return 1 - (2. * intersection + smooth) / (tf.keras.backend.sum(y_true_f) +
                                               tf.keras.backend.sum(y_pred_f) + smooth)
epoch   train_dice_score      train_loss    val_dice_score  val_loss
0       0.42387727            0.423877264   0.35388064      0.353880603
1       0.23064087            0.230640889   0.21502239      0.215022382
2       0.17881058            0.178810576   0.1767999       0.176799848
3       0.15746565            0.157465705   0.16138957      0.161389555
4       0.13828343            0.138283484   0.12770002      0.127699989
5       0.10434002            0.104340041   0.0981831       0.098183098
6       0.08013707            0.080137035   0.08188484      0.081884826
7       0.07081806            0.070818066   0.070421465     0.070421467
8       0.058371827           0.058371854   0.060712796     0.060712777
9       0.06381426            0.063814262   nan             nan
10      0.105625264           0.105625251   nan             nan
11      0.10790708            0.107907102   nan             nan
12      0.10719114            0.10719115    nan             nan


Upvotes: 3

Views: 2781

Answers (1)

Omer Savran

Reputation: 31

I was getting the same problem with my segmentation model too. It happened when I used both dice loss and weighted cross-entropy loss together. I'm posting the solution I found in case somebody still has the same problem.

I was focusing on my custom loss, but then I figured out that the NAN values came from inside the model during the forward pass. Because of ReLU, the intermediate activations grew too high and eventually became NAN.

To solve this, I used batch normalization after every convolution followed by ReLU, and it worked for me.
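A minimal sketch of what such a conv + batch norm + ReLU block could look like in Keras (the filter counts, input shape, and helper name conv_bn_relu are illustrative, not from the original model):

import tensorflow as tf

def conv_bn_relu(x, filters):
    # Convolution without a built-in activation, then batch norm, then ReLU.
    # Normalizing before the ReLU keeps the pre-activation values in a
    # stable range, so activations are less likely to blow up to NAN.
    x = tf.keras.layers.Conv2D(filters, 3, padding="same")(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    return x

inputs = tf.keras.Input(shape=(256, 256, 1))   # assumed input shape
x = conv_bn_relu(inputs, 32)
x = conv_bn_relu(x, 64)
outputs = tf.keras.layers.Conv2D(1, 1, activation="sigmoid")(x)
model = tf.keras.Model(inputs, outputs)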

Upvotes: 3
