Reputation: 2636
I'm training a network which has multiple losses and both creating and feeding the data into my network using a generator.
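Roughly, the setup looks like this. This is a simplified, hypothetical sketch rather than my actual model or generator, just to show the shape of the problem: three softmax outputs, each with its own categorical_crossentropy loss, fed from a Python generator:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical multi-output model: one shared trunk, three softmax heads.
inputs = keras.Input(shape=(64,))
x = layers.Dense(128, activation="relu")(inputs)
out1 = layers.Dense(10, activation="softmax", name="out1")(x)
out2 = layers.Dense(10, activation="softmax", name="out2")(x)
out3 = layers.Dense(10, activation="softmax", name="out3")(x)
model = keras.Model(inputs, [out1, out2, out3])

# One categorical_crossentropy loss per output.
model.compile(optimizer="adam",
              loss={"out1": "categorical_crossentropy",
                    "out2": "categorical_crossentropy",
                    "out3": "categorical_crossentropy"})

def batch_generator(batch_size=32):
    # Dummy random batches purely for illustration; the real generator
    # builds batches from my dataset.
    while True:
        x_batch = np.random.rand(batch_size, 64).astype("float32")
        y = np.eye(10, dtype="float32")[np.random.randint(0, 10, size=batch_size)]
        yield x_batch, {"out1": y, "out2": y, "out3": y}

# Older Keras would use model.fit_generator here.
model.fit(batch_generator(), steps_per_epoch=300, epochs=10)
```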
I've checked the structure of the data and it generally looks fine, and the network also trains pretty much as expected most of the time. However, at a seemingly random epoch in almost every run, the training loss for every output suddenly drops from, say,
# End of epoch 3
loss: 2.8845
to
# Beginning of epoch 4
loss: 1.1921e-07
I thought it could be the data, but from what I can tell the data is generally fine. What makes it even more suspicious is that this happens at a random epoch (perhaps triggered by a particular data point drawn during SGD?) but then persists for the rest of training: if the training loss collapses to 1.1921e-07 at epoch 3, it stays that way through epoch 4, epoch 5, and so on.
However, sometimes it reaches epoch 5 without this happening, and then it happens at epoch 6 or 7.
Is there any plausible cause for this other than the data? Could even a few dodgy data points really derail training this quickly?
Thanks
EDIT:
Results:
300/300 [==============================] - 339s - loss: 3.2912 - loss_1: 1.8683 - loss_2: 9.1352 - loss_3: 5.9845 -
val_loss: 1.1921e-07 - val_loss_1: 1.1921e-07 - val_loss_2: 1.1921e-07 - val_loss_3: 1.1921e-07
All subsequent epochs then have a training loss of 1.1921e-07 as well.
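Interestingly, 1.1921e-07 is exactly single-precision machine epsilon, which suggests the loss is bottoming out at the smallest representable float32 increment rather than genuinely reaching zero. You can confirm the value with numpy:

```python
import numpy as np

# float32 machine epsilon matches the observed loss value.
print(np.finfo(np.float32).eps)  # 1.1920929e-07
```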
Upvotes: 5
Views: 1570
Reputation: 2636
Not entirely sure how satisfactory this is as an answer, but my findings suggest that using multiple categorical_crossentropy losses together makes the network extremely unstable. Swapping them out for other loss functions fixed the problem, with the data left unchanged.
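For example, with the hypothetical three-output model sketched in the question, the change amounts to recompiling with different losses. Which replacement loss is appropriate depends on your targets; KL divergence is shown here purely as one option for softmax outputs, not necessarily what you should use:

```python
# Same hypothetical model and generator as in the question's sketch;
# only the loss functions change.
model.compile(optimizer="adam",
              loss={"out1": "kullback_leibler_divergence",
                    "out2": "kullback_leibler_divergence",
                    "out3": "kullback_leibler_divergence"})
model.fit(batch_generator(), steps_per_epoch=300, epochs=10)
```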
Upvotes: 1