Reputation: 7618
The official PyTorch tutorial (https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#gradients) indicates that out.backward()
and out.backward(torch.tensor(1))
are equivalent. But this does not seem to be the case.
import torch
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()
# option 1
out.backward()
# option 2: replace option 1 with this line; do not run both
# out.backward(torch.tensor(1))
print(x.grad)
Using option 2 (commented out) results in an error.
Note: run only one of the two backward calls; replace option 1 with option 2.
Is the tutorial out of date? What is the purpose of the argument?
Update
If I use out.backward(torch.tensor(1))
as the tutorial says, I get:
E RuntimeError: invalid gradient at index 0 - expected type torch.FloatTensor but got torch.LongTensor
../../../anaconda3/envs/phd/lib/python3.6/site-packages/torch/autograd/__init__.py:90: RuntimeError
I tried also using out.backward(torch.Tensor(1))
and I get instead:
E RuntimeError: invalid gradient at index 0 - expected shape [] but got [1]
../../../anaconda3/envs/phd/lib/python3.6/site-packages/torch/autograd/__init__.py:90: RuntimeError
Upvotes: 0
Views: 658
Reputation: 24119
You need to use dtype=torch.float:
import torch
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()
# option 1
out.backward()
print(x.grad)
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()
# option 2: pass a float scalar tensor
out.backward(torch.tensor(1, dtype=torch.float))
print(x.grad)
Output:
tensor([[ 4.5000, 4.5000],
[ 4.5000, 4.5000]])
tensor([[ 4.5000, 4.5000],
[ 4.5000, 4.5000]])
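As for the purpose of the argument: backward() computes a vector-Jacobian product, and the tensor you pass is that vector. For a scalar output it defaults to a scalar 1.0, which is why out.backward() works with no argument; for a non-scalar output you must supply it explicitly. A minimal sketch (the shapes and values here are illustrative, not from the question):

```python
import torch

# Non-scalar output: backward() now requires an explicit gradient
# argument v, the vector in the vector-Jacobian product J^T v.
x = torch.ones(2, 2, requires_grad=True)
y = (x + 2) * 3          # output has shape (2, 2), not a scalar
v = torch.ones(2, 2)     # weight for each output element
y.backward(v)            # equivalent to (y * v).sum().backward()
print(x.grad)            # each dy_i/dx_i is 3, so x.grad is all 3s
```

Passing v of all ones sums the output's gradients; a different v reweights each output element's contribution.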
Upvotes: 2