flawr

Reputation: 11628

Derivative in both arguments of torch.nn.BCELoss()

When using torch.nn.BCELoss() on two arguments that are both results of some earlier computation, I get a curious error, which this question is about:

RuntimeError: the derivative for 'target' is not implemented

The MCVE is as follows:

import torch
import torch.nn.functional as F

net1 = torch.nn.Linear(1,1)
net2 = torch.nn.Linear(1,1)
loss_fcn = torch.nn.BCELoss()

x = torch.zeros((1,1))

y = F.sigmoid(net1(x)) #make sure y is in range (0,1)
z = F.sigmoid(net2(y)) #make sure z is in range (0,1)

loss = loss_fcn(z, y) #works if we replace y with y.detach()

loss.backward()

It turns out that if we call .detach() on y, the error disappears. But this results in a different computation: in the .backward() pass, the gradients with respect to the second argument of the BCELoss are no longer computed.

Can anyone explain what I am doing wrong here? As far as I know, all PyTorch modules in torch.nn should support computing gradients. And the error message seems to say that the derivative is not implemented for y, which is strange: the gradient of y can be computed, while that of y.detach() cannot, yet only the detached version is accepted, which seems contradictory.
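
For what it's worth, appending the following check to the MCVE confirms how the two tensors are seen by autograd (this is just my own sanity check, not part of the error output):

print(y.requires_grad)           # True  - y comes out of net1, so autograd tracks it
print(y.detach().requires_grad)  # False - the detached copy is treated as a constant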

Upvotes: 3

Views: 3248

Answers (2)

Nei Wu

Reputation: 11

I ran into the same problem. As far as I know, the second argument target of BCELoss(input, target) should be a tensor that does not require gradients, i.e. target.requires_grad should be False. But I don't know why.

Usually, the target (we can also call it the ground truth) does not require gradients. But your target (y in your code) was computed by F.sigmoid(net1(x)), which means the target, being the output of net1, is a tensor that requires gradients.
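
For comparison, here is a small sketch of the usual setup, where the target is a plain label tensor (the tensors and shapes below are made up just for illustration). Such a tensor has requires_grad == False by default, so BCELoss never needs a derivative for it:

import torch

loss_fcn = torch.nn.BCELoss()

pred = torch.sigmoid(torch.randn(4, 1, requires_grad=True))  # stand-in for a network output
labels = torch.tensor([[0.], [1.], [1.], [0.]])              # ordinary ground-truth labels

print(labels.requires_grad)    # False - plain tensors are not tracked by autograd
loss = loss_fcn(pred, labels)  # fine: no derivative w.r.t. the target is needed
loss.backward()                # the derivative w.r.t. the input is implemented, so this works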

So you should try:

loss = loss_fcn(z, y.detach())

or:

loss = loss_fcn(z, y.data)

Or maybe try something like this:

import torch
import torch.nn.functional as F

net1 = torch.nn.Linear(1,1)
net2 = torch.nn.Linear(1,1)
loss_fcn = torch.nn.BCELoss()

x = torch.zeros((1,1))

y = F.sigmoid(net1(x)) #make sure y is in range (0,1)
z = F.sigmoid(net2(y)) #make sure z is in range (0,1)

y.retain_grad()  # y is not a leaf tensor, so ask autograd to keep its gradient

loss = loss_fcn(z, y.detach())  # detach the target so BCELoss never needs its derivative

loss.backward()

print(y.grad)  # gradient of the loss w.r.t. y, flowing back through z only

Upvotes: 1

flawr

Reputation: 11628

It seems I misunderstood the error message. It is not y that does not allow computing gradients; rather, BCELoss() is not able to compute gradients with respect to its second argument. A similar problem was discussed here.
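
In case someone actually needs gradients with respect to both arguments: one possible workaround is to write the binary cross-entropy out with ordinary tensor operations, which autograd can differentiate with respect to both inputs. This is only a rough sketch of that idea, and the small epsilon for numerical stability is my own addition, not something BCELoss exposes:

import torch

net1 = torch.nn.Linear(1, 1)
net2 = torch.nn.Linear(1, 1)

x = torch.zeros((1, 1))
y = torch.sigmoid(net1(x))  # plays the role of the target
z = torch.sigmoid(net2(y))  # plays the role of the prediction

# binary cross-entropy written out by hand: -(y*log(z) + (1-y)*log(1-z))
eps = 1e-12  # guard against log(0)
loss = -(y * torch.log(z + eps) + (1 - y) * torch.log(1 - z + eps)).mean()

loss.backward()
print(net1.weight.grad)  # net1 now receives gradient through both arguments of the loss
print(net2.weight.grad)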

Upvotes: 1
