aa1

Reputation: 803

How to use backpropagation with a self-defined loss in pytorch?

I am trying to implement a Siamese network with a ranking loss between two images. If I define my own loss, can I do the backpropagation step as follows? When I run it, it sometimes seems to give the same results that a single network gives.

with torch.set_grad_enabled(phase == 'train'):
    outputs1 = model(inputs1)
    outputs2 = model(inputs2)
    preds1 = outputs1
    preds2 = outputs2

    alpha = 0.02
    w_r = torch.tensor(1).cuda(async=True)

    y_i, y_j, predy_i, predy_j = labels1, labels2, outputs1, outputs2
    batchRankLoss = torch.tensor([max(0, alpha - delta(y_i[i], y_j[i])*predy_i[i] - predy_j[i]) for i in range(batchSize)], dtype=torch.float)
    rankLossPrev = torch.mean(batchRankLoss)
    rankLoss = Variable(rankLossPrev, requires_grad=True)

    loss1 = criterion(outputs1, labels1)
    loss2 = criterion(outputs2, labels2)

    # total loss = loss1 + loss2 + w_r*rankLoss
    totalLoss = torch.add(loss1, loss2)
    w_r = w_r.type(torch.LongTensor)
    rankLossPrev = rankLossPrev.type(torch.LongTensor)
    mult = torch.mul(w_r.type(torch.LongTensor), rankLossPrev).type(torch.FloatTensor)
    totalLoss = torch.add(totalLoss, mult.cuda(async=True))

    # backward + optimize only if in training phase
    if phase == 'train':
        totalLoss.backward()
        optimizer.step()

        running_loss += totalLoss.item() * inputs1.size(0)

Upvotes: 0

Views: 2353

Answers (2)

weiyixie

Reputation: 581

    # clamp(min=0) gives the element-wise max(0, .); stacking keeps the graph intact
    rank_loss = torch.stack([torch.clamp(alpha - delta(y_i[i], y_j[i]) * predy_i[i] - predy_j[i], min=0)
                             for i in range(batchSize)]).mean()
    w_r = 1.0
    loss1 = criterion(outputs1, labels1)
    loss2 = criterion(outputs2, labels2)
    total_loss = loss1 + loss2 + w_r * rank_loss
    if phase == 'train':
        total_loss.backward()
        optimizer.step()

You don't have to create a tensor over and over again. If you have a different weight for each loss and the weights are just constants, you can simply write:

total_loss = weight_1 * loss1 + weight_2 * loss2 + weight_3  * rank_loss

These weights are untrainable constants anyway; it does not make sense to wrap them in a Variable with requires_grad=True, because they are just constants. Please upgrade to PyTorch 0.4.1, in which you don't have to wrap everything with Variable.
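For instance, a minimal sketch of how this looks in PyTorch 0.4+ (the names mirror the snippet above and the weight values are illustrative):

    # Constant weights can be plain Python floats; autograd treats them as constants,
    # and the combined loss keeps its grad_fn as long as each term has one.
    weight_1, weight_2, weight_3 = 1.0, 1.0, 0.5
    total_loss = weight_1 * loss1 + weight_2 * loss2 + weight_3 * rank_loss
    total_loss.backward()  # gradients flow into the network through all three terms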

Upvotes: 0

twink_ml

Reputation: 512

You have several lines where you generate new Tensors from a constructor or a cast to another data type. When you do this, you disconnect the chain of operations through which you'd like the backward() call to differentiate.
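You can see the disconnection by inspecting grad_fn; here is a minimal standalone sketch (the tensors are illustrative, not from the question):

    import torch

    x = torch.ones(3, requires_grad=True)
    y = x * 2
    print(y.grad_fn)              # <MulBackward0 ...>: y is attached to the graph
    z = y.type(torch.LongTensor)  # cast to an integer dtype
    print(z.grad_fn)              # None: gradients cannot flow back past z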

This cast disconnects the graph because casting is non-differentiable:

w_r = w_r.type(torch.LongTensor)

Building a Tensor from a constructor will disconnect the graph:

batchRankLoss = torch.tensor([max(0, alpha - delta(y_i[i], y_j[i])*predy_i[i] - predy_j[i]) for i in range(batchSize)], dtype=torch.float)

From the docs, wrapping a Tensor in a Variable will set the grad_fn to None (also disconnecting the graph):

rankLoss = Variable(rankLossPrev,requires_grad=True)

Assuming that your criterion function is differentiable, gradients currently flow backward only through loss1 and loss2. Your other gradients will only flow as far as mult before they are stopped by a call to type(). This is consistent with your observation that your custom loss doesn't change the output of your neural network.

To allow gradients to flow backward through your custom loss, you'll have to code the same logic while avoiding type() casts and computing rankLoss with tensor operations rather than rebuilding it from a list comprehension with the torch.tensor() constructor; see the sketch below.
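A minimal sketch of one graph-preserving way to do this, assuming delta depends only on the labels (so it needs no gradient) and the network emits one score per image:

    # delta uses only labels, so building it with the torch.tensor() constructor is harmless
    delta_vals = torch.tensor([delta(y_i[i], y_j[i]) for i in range(batchSize)],
                              dtype=torch.float, device=predy_i.device)
    # clamp(min=0) implements the element-wise max(0, .) without leaving the graph
    rankLoss = torch.clamp(alpha - delta_vals * predy_i.view(-1) - predy_j.view(-1), min=0).mean()
    totalLoss = loss1 + loss2 + 1.0 * rankLoss  # plain float weight: no casts, no Variable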

Upvotes: 1
