Reputation: 1373
I have tensors of shape BxCxHxW, where B is the batch size. I am interested in how the loss is implemented in PyTorch when the batch size is greater than 1. The code below is what we usually do in PyTorch:
l1_loss = torch.nn.L1Loss()
loss = l1_loss(pred,gt)
Is the resulting loss the average loss over the batch? If so, is the following code equivalent to the usual approach in PyTorch (the code above)?
l1_loss = torch.nn.L1Loss()
for idx in range(B):
    if idx == 0:
        loss = l1_loss(pred[idx, :, :, :], gt[idx, :, :, :])
    else:
        loss = l1_loss(pred[idx, :, :, :], gt[idx, :, :, :]) + loss
loss = loss / B
Upvotes: 2
Views: 2089
Reputation: 785
If your question is whether they do "something" like what you've described, then the answer is yes, though not exactly in that form: the loss modules keep their object-oriented design, but by default the loss is, as you guessed, reduced across the batch dimension. You could make your code more compact with something like this:
def l1_loss(output, target):
    # sum the absolute errors over the feature dimension, then average over the batch
    loss = torch.abs(output - target).sum(dim=1).mean(dim=0)
    return loss
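As a quick illustration (the tensor shapes and values below are made up), the function could be used like this:

import torch

def l1_loss(output, target):
    # same function as above: sum over features (dim 1), mean over the batch (dim 0)
    return torch.abs(output - target).sum(dim=1).mean(dim=0)

# made-up example: batch of 4 samples, 10 features each
pred = torch.randn(4, 10)
gt = torch.randn(4, 10)
print(l1_loss(pred, gt))  # a single scalar tensor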
Upvotes: 1
Reputation: 3553
The documentation describes the behavior of L1Loss: by default it is indeed the mean over every element in the batch. You can easily change it to the sum instead:
l1_loss = torch.nn.L1Loss(reduction='sum')
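As a small illustration (the tensor values are arbitrary), the reduction argument controls whether the elementwise absolute errors are averaged, summed, or returned unreduced:

import torch

pred = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
gt = torch.tensor([[1.5, 2.0], [2.0, 6.0]])

print(torch.nn.L1Loss(reduction='mean')(pred, gt))  # mean over all 4 elements -> 0.875
print(torch.nn.L1Loss(reduction='sum')(pred, gt))   # sum over all 4 elements  -> 3.5
print(torch.nn.L1Loss(reduction='none')(pred, gt))  # per-element errors, shape (2, 2)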
Yes, your code is equivalent to what PyTorch does. A version without the call to L1Loss would be:
# Assuming your output is a vector of shape (B, F)
loss = torch.abs(pred - gt).sum(dim=1).mean(dim=0)
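A quick way to convince yourself of the equivalence (the shapes below are arbitrary) is to compare the single batched call with the per-sample loop from the question:

import torch

B, C, H, W = 4, 3, 8, 8  # arbitrary sizes
pred = torch.randn(B, C, H, W)
gt = torch.randn(B, C, H, W)

l1_loss = torch.nn.L1Loss()

# single batched call: mean over all B*C*H*W elements
batched = l1_loss(pred, gt)

# per-sample losses averaged over the batch, as in the question
looped = sum(l1_loss(pred[idx], gt[idx]) for idx in range(B)) / B

print(torch.allclose(batched, looped))  # True (up to floating-point precision)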
Upvotes: 2