Juan Leni

Reputation: 7618

Is there a mistake in pytorch tutorial?

The official PyTorch tutorial (https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#gradients) states that out.backward() and out.backward(torch.tensor(1)) are equivalent, but this does not seem to be the case.

import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()

# option 1
out.backward()

# option 2 -- replace option 1 with this; do not run both on the same graph
# out.backward(torch.tensor(1))

print(x.grad)

Using option 2 (commented out) results in an error.

Note: do not run both backward calls; replace option 1 with option 2.

Is the tutorial out of date? What is the purpose of the argument?

Update: If I use out.backward(torch.tensor(1)) as the tutorial suggests, I get:

E       RuntimeError: invalid gradient at index 0 - expected type torch.FloatTensor but got torch.LongTensor

../../../anaconda3/envs/phd/lib/python3.6/site-packages/torch/autograd/__init__.py:90: RuntimeError

I also tried using out.backward(torch.Tensor(1)) and instead get:

E       RuntimeError: invalid gradient at index 0 - expected shape [] but got [1]

../../../anaconda3/envs/phd/lib/python3.6/site-packages/torch/autograd/__init__.py:90: RuntimeError
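
Inspecting the two constructors seems to confirm exactly what the errors say (a quick check, assuming PyTorch 0.4+ dtype semantics):

import torch

print(torch.tensor(1).dtype)  # torch.int64: an int literal infers a LongTensor
print(torch.tensor(1).shape)  # torch.Size([]): a 0-dim scalar

t = torch.Tensor(1)           # capital T: uninitialized FloatTensor of size 1
print(t.dtype)                # torch.float32
print(t.shape)                # torch.Size([1]): shape [1], not a 0-dim scalar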

Upvotes: 0

Views: 658

Answers (1)

MBT

Reputation: 24119

You need to create the gradient tensor with dtype=torch.float:

import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()

# option 1: no argument; the gradient defaults to a scalar 1
out.backward()
print(x.grad)

# rebuild the graph before calling backward a second time
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()

# option 2: an explicit scalar gradient with a matching float dtype
out.backward(torch.tensor(1, dtype=torch.float))

print(x.grad)

Output:

tensor([[ 4.5000,  4.5000],
        [ 4.5000,  4.5000]])
tensor([[ 4.5000,  4.5000],
        [ 4.5000,  4.5000]])
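
As for the purpose of the argument: backward() computes a vector-Jacobian product, and the argument is that vector. For a scalar output it defaults to a scalar 1, which is why both options above agree; for a non-scalar output you must pass it explicitly. A minimal sketch:

import torch

x = torch.ones(2, 2, requires_grad=True)
z = 3 * (x + 2) * (x + 2)  # non-scalar output, shape [2, 2]

# z.backward() alone would raise "grad can be implicitly created only for
# scalar outputs", so we supply the vector for the vector-Jacobian product
z.backward(torch.ones_like(z))

print(x.grad)  # all 18s: d/dx 3*(x+2)^2 = 6*(x+2) = 18 at x = 1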

Upvotes: 2
