Reputation: 437
I am using a custom loss function (triplet loss) with mini-batches. During each epoch the loss gradually decreases, but right after every epoch boundary there is a sudden drop in loss (roughly a 10% fall), after which it continues to decrease gradually through the next epoch (ignore the accuracy). Is this normal?
Any answer or reference on this problem will be appreciated.
Epoch 1/5
198/198 [==============================] - 3299s 17s/step - loss: 0.2500 - acc: 0.0014
Epoch 2/5
 99/198 [==============>...............] - ETA: 26:16 - loss: 0.1220 - acc: 0.0016
Upvotes: 7
Views: 5971
Reputation: 15
I noticed the same pattern training a model with triplet loss using PyTorch. Since the evaluation loss didn't drop in the same manner, I attributed it to the fact that the model has already seen these samples and adjusted its parameters with respect to their loss. The model already learned from these triplets once, so it will do a better job on the difference between the anchor-positive distance and the anchor-negative distance. Of course, this only applies if you're using the same triplets in each epoch. If you're doing online triplet mining and completely different triplets can come up in each epoch, then there's something else going on.
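To make the intuition concrete, here is a minimal NumPy sketch of the standard triplet loss (the margin value and the example embeddings are illustrative, not taken from the question): once the model has pushed the positive inside the margin relative to the negative, the loss on that same triplet goes to zero, which is why re-seen triplets score lower.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: push the anchor-positive distance below
    the anchor-negative distance by at least `margin`."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(d_ap - d_an + margin, 0.0)

# A triplet the model has already fit: positive close, negative far.
anchor   = np.array([1.0, 0.0])
positive = np.array([1.1, 0.0])    # d_ap = 0.1
negative = np.array([3.0, 0.0])    # d_an = 2.0
print(triplet_loss(anchor, positive, negative))        # 0.0, margin satisfied

# A hard (e.g. freshly mined) triplet: negative closer than positive.
negative_hard = np.array([1.05, 0.0])                  # d_an = 0.05
print(triplet_loss(anchor, positive, negative_hard))   # nonzero loss
```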
Upvotes: 0
Reputation: 1
Fluctuations of the loss within an epoch (i.e. the running loss) are fine as long as the average loss per epoch keeps decreasing over subsequent epochs. As for why there is a sudden jump: maybe the model is converging very quickly.
Upvotes: 0
Reputation: 10475
Note: This answer assumes you are using Keras -- you might want to add this information to your post, or at least add a relevant tag.
Yes, this is because the displayed values are averaged over the epoch. Consider epoch 1. At the beginning of training, the loss will usually be quite large. It will then decrease, but the displayed value for epoch 1 will still include those large values from the beginning in the average. For example, let's say the loss in the beginning is 0.75 and decreases linearly to 0.25 until the end of the first epoch; this would mean an average of 0.5 which would be the value shown for epoch 1.
Once epoch 2 starts, the average is reset and will be computed again for this epoch. Let's continue with the example, so the loss is 0.25 at the beginning of epoch 2 and decreases linearly to 0. This means the loss displayed for epoch 2 will be 0.125! More importantly however, it will start at 0.25, and so already at the beginning of the epoch you will see a large drop from the value of 0.5 shown for epoch 1.
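You can reproduce the arithmetic of this example with a few lines of NumPy (the batch count of 198 is taken from your log; the linear loss curve is the illustrative assumption from the example above):

```python
import numpy as np

# Simulate the example: per-batch loss falls linearly from 0.75 to 0.25
# during epoch 1, then from 0.25 to 0.0 during epoch 2.
batches_per_epoch = 198
epoch1 = np.linspace(0.75, 0.25, batches_per_epoch)
epoch2 = np.linspace(0.25, 0.0, batches_per_epoch)

# Keras displays the running mean of the batch losses *within* an epoch,
# and resets that mean when a new epoch starts.
print(f"shown at end of epoch 1:   {epoch1.mean():.3f}")   # 0.500
print(f"shown at start of epoch 2: {epoch2[0]:.3f}")       # 0.250 -- the "drop"
print(f"shown at end of epoch 2:   {epoch2.mean():.3f}")   # 0.125
```

So the apparent 50% drop at the epoch boundary is purely an artifact of the displayed average being reset, not a real jump in the underlying per-batch loss.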
Upvotes: 17