Reputation: 1
I use PyTorch Lightning to train a model and log the loss to the TensorBoard logger after each iteration. I ended up with two losses reported: one is train_loss_epoch and the other is train_loss_step. I am wondering how train_loss_epoch is calculated precisely. Is it the average of the losses over all global steps in one epoch?
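For reference, this is roughly how I log the loss in my training step (a minimal sketch, not my actual model; the model and data here are just placeholders):

```python
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # Placeholder model; my real network is different.
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        # Logging with both on_step and on_epoch is what produces the two
        # curves train_loss_step and train_loss_epoch in TensorBoard.
        self.log("train_loss", loss, on_step=True, on_epoch=True)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)
```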
I attached the two loss curves below. From train_loss_step I cannot see the training converging, but train_loss_epoch does tend to go down, which also confuses me. Was my training effective?
[Train_loss_epoch]
[Train_loss_step]
Many thanks for any feedback and thoughts.
Upvotes: 0
Views: 37