Reputation: 3037
When logging my validation loss inside validation_step()
in PyTorch Lightning like this:
def validation_step(self, batch: Tuple[Tensor, Tensor], _batch_index: int) -> None:
    inputs_batch, labels_batch = batch
    outputs_batch = self(inputs_batch)
    loss = self.criterion(outputs_batch, labels_batch)
    self.log('loss (valid)', loss.item())
Then I get an epoch-wise loss curve (one value per epoch):
If I want a step-wise loss curve instead, I can set on_step=True:
def validation_step(self, batch: Tuple[Tensor, Tensor], _batch_index: int) -> None:
    inputs_batch, labels_batch = batch
    outputs_batch = self(inputs_batch)
    loss = self.criterion(outputs_batch, labels_batch)
    self.log('loss', loss.item(), on_step=True)
This results in a separate step-wise loss curve for each epoch:
How can I get a single graph over all epochs instead? When training runs for thousands of epochs, this gets messy.
Upvotes: 2
Views: 5414
Reputation: 497
It seems that something went wrong when you initialized your logger. Is it defined like the following:
logger = TensorBoardLogger("tb_logs", name="my_model")
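For reference, this logger then has to be passed to the Trainer so that it is available as self.logger inside your module; a minimal sketch (the directory and run names are just the examples from above):
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import TensorBoardLogger

logger = TensorBoardLogger("tb_logs", name="my_model")
trainer = Trainer(logger=logger)  # exposed as self.logger in the LightningModule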
Note that on_step will modify the tag you log under, which is one reason why the curves show up as separate plots.
Instead of using on_step you can call the underlying TensorBoard SummaryWriter directly:
self.logger.experiment.add_scalar('name', metric)
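For example, inside validation_step you could log every batch under one fixed tag against self.global_step (a sketch; 'valid_loss' is just an example tag name, and self.global_step counts completed training steps, so all epochs share one x-axis):
def validation_step(self, batch, _batch_index):
    inputs_batch, labels_batch = batch
    outputs_batch = self(inputs_batch)
    loss = self.criterion(outputs_batch, labels_batch)
    # one fixed tag keeps all epochs in a single graph;
    # the third argument sets the x position of the point
    self.logger.experiment.add_scalar('valid_loss', loss.item(), self.global_step)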
If you want the plot's x-axis to show the number of epochs instead of steps, you can place the logging call inside validation_epoch_end(self, outputs).
def validation_epoch_end(self, outputs):
    # outputs collects whatever validation_step returned for each batch,
    # so each entry is expected to contain a "val_loss" tensor
    avg_loss = torch.stack([x["val_loss"] for x in outputs]).mean()
    self.logger.experiment.add_scalar('loss', avg_loss, self.current_epoch)
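Note that this assumes validation_step returns the loss so that it ends up in outputs; a sketch reusing the code from the question:
def validation_step(self, batch, _batch_index):
    inputs_batch, labels_batch = batch
    outputs_batch = self(inputs_batch)
    loss = self.criterion(outputs_batch, labels_batch)
    return {"val_loss": loss}
Because self.current_epoch is passed as the step, TensorBoard draws one point per epoch, giving a single curve over the whole run.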
Upvotes: 1