Reputation: 810
I am reading the PyTorch official tutorial on fine-tuning and I ran into a problem with the calculation of the loss in each epoch.
Before this, I computed the loss for each batch, accumulated these batch losses, and took the mean of those values as the loss of the epoch. But in that example the calculation is done as follows:
for inputs, labels in dataloaders[phase]:
    inputs = inputs.to(device)
    labels = labels.to(device)

    # zero the parameter gradients
    optimizer.zero_grad()

    # forward
    # track history if only in train
    with torch.set_grad_enabled(phase == 'train'):
        outputs = model(inputs)
        _, preds = torch.max(outputs, 1)
        loss = criterion(outputs, labels)

        # backward + optimize only if in training phase
        if phase == 'train':
            loss.backward()
            optimizer.step()

    # statistics
    running_loss += loss.item() * inputs.size(0)
    running_corrects += torch.sum(preds == labels.data)
My question is about this line:
running_loss += loss.item() * inputs.size(0)
It multiplies the loss value of the batch by the batch size. What is the correct way to calculate the loss of an epoch?
Also, what is the unit of the loss, and what is the range of the loss value?
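For reference, here is a minimal sketch of what I did before (averaging the per-batch mean losses; dataloader, model and criterion are just placeholders):

batch_losses = []
for inputs, labels in dataloader:
    outputs = model(inputs)
    loss = criterion(outputs, labels)        # mean loss over this batch
    batch_losses.append(loss.item())

# mean of the per-batch means; this equals the true epoch mean
# only if every batch has the same size
epoch_loss = sum(batch_losses) / len(batch_losses)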
Upvotes: 2
Views: 2502
Reputation: 1098
Yes, the code snippet accumulates the batch mean loss multiplied by the batch size. If you want the true summation instead, you can use
torch.nn.CrossEntropyLoss(reduction = "sum")
which gives you the sum of the errors over the batch. Then you can simply accumulate it for each batch as follows:
running_loss += loss.item()
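Putting it together, a minimal sketch of the epoch loss computed with reduction="sum" (assuming the dataloaders, dataset_sizes, model, device and phase variables from the tutorial):

criterion = torch.nn.CrossEntropyLoss(reduction="sum")

running_loss = 0.0
for inputs, labels in dataloaders[phase]:
    inputs = inputs.to(device)
    labels = labels.to(device)
    outputs = model(inputs)
    loss = criterion(outputs, labels)    # summed loss over the batch
    running_loss += loss.item()          # no multiplication by batch size needed

# divide by the number of samples to get the mean loss per sample for the epoch
epoch_loss = running_loss / dataset_sizes[phase]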
The range of the loss value depends on your number of classes and your feature vector. The code in your question will produce the same running_loss as using reduction="sum",
because your code effectively computes
(loss / batch_size) * batch_size
which is the same as the summed loss value. However, the backpropagation changes, because in one case you backpropagate from the sum of the losses and in the other from the mean loss.
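A minimal sketch of that difference on toy data (the tensors below are made up for illustration): with reduction="sum" the gradients are exactly batch_size times larger than with reduction="mean".

import torch

torch.manual_seed(0)
logits = torch.randn(4, 3, requires_grad=True)   # batch of 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 0])

loss_mean = torch.nn.CrossEntropyLoss(reduction="mean")(logits, labels)
grad_mean, = torch.autograd.grad(loss_mean, logits)

loss_sum = torch.nn.CrossEntropyLoss(reduction="sum")(logits, labels)
grad_sum, = torch.autograd.grad(loss_sum, logits)

# the summed loss backpropagates gradients batch_size (= 4) times larger
print(torch.allclose(grad_sum, grad_mean * 4))   # True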
Upvotes: 1