Mr.Robot

Reputation: 349

Disparate results after setting requires_grad=True

I am currently using PyTorch for deep neural networks. I wrote the toy network shown below, and I found that whether or not I set requires_grad=True for the label y makes a huge difference: when y.requires_grad is True, the network diverges. I am wondering why this happens.

import torch

x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)
y = x.pow(2) + 10 * torch.rand(x.size())


x.requires_grad = True
# this is where the problem occurs
y.requires_grad = True

class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden)
        self.predict = torch.nn.Linear(n_hidden, n_output)

    def forward(self, x):
        x = torch.relu(self.hidden(x))
        x = self.predict(x)
        return x

net = Net(1, 10, 1)
optimizer = torch.optim.SGD(net.parameters(), lr=0.5)
criterion = torch.nn.MSELoss()


for t in range(200):
    y_pred = net(x)

    loss = criterion(y_pred, y)

    optimizer.zero_grad()
    loss.backward()
    print("Epoch {}: {}".format(t, loss))
    optimizer.step()

Upvotes: 0

Views: 339

Answers (1)

dennlinger

Reputation: 11490

It seems that you are using an outdated version of PyTorch. In more recent versions (0.4.0+), this will throw the following error:

AssertionError: nn criterions don't compute the gradient w.r.t. targets - 
                please mark these tensors as not requiring gradients

Essentially, it tells you that the criterion will only work if you set the requires_grad flag to False for your targets. Why it runs at all in prior versions is indeed very interesting, as is why it causes the diverging behavior you observe.
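
For example, to make your snippet run, you could simply not set the flag on y, or detach the targets right before computing the loss (a minimal sketch reusing your variable names):

y = x.pow(2) + 10 * torch.rand(x.size())
# leave y.requires_grad at its default (False), i.e. drop the
# y.requires_grad = True line entirely

# or, if y already requires grad for some other reason, cut it out
# of the graph just for the loss computation:
loss = criterion(y_pred, y.detach())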

My guess would be that a backward pass would then also change your targets (instead of only changing your weights), which is obviously something you do not want.
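
One rough way to probe this guess, on a PyTorch version that still accepts the flag, is to inspect y.grad after the backward pass; since y is a leaf tensor in your code, any gradient that reaches it is stored there:

loss = criterion(y_pred, y)
loss.backward()
# y.grad stays None if no gradient reached the targets,
# and holds a tensor of the same shape as y otherwise
print(y.grad)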

Upvotes: 1
