Reputation: 537
I have recently been studying PyTorch and backward(). I thought I understood how to use it, but when I try:
x = Variable(2*torch.ones(2, 2), requires_grad=True)
x.backward(x)
print(x.grad)
I expect:
tensor([[1., 1.],
[1., 1.]])
because it is the identity function. However, it returns:
tensor([[2., 2.],
[2., 2.]]).
Why does this happen?
Upvotes: 10
Views: 10201
Reputation: 1
I think you misunderstand how to use tensor.backward(). The parameter you pass to backward() is not the x of dy/dx.
For example, if y is obtained from x by some operation, then y.backward(w) makes PyTorch first form l = dot(y, w), with w treated as a constant, and then compute dl/dx.
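Here is a minimal sketch of that equivalence (my own example, not from your code: y = x*x and an arbitrary constant w):
import torch

# Route 1: pass w straight to backward()
x = torch.ones(2, 2, requires_grad=True)
w = torch.full((2, 2), 3.0)   # arbitrary constant weights (my choice)
y = x * x
y.backward(w)
print(x.grad)                 # tensor([[6., 6.], [6., 6.]])  (w * dy/dx = 3 * 2x)

# Route 2: build l = dot(y, w) by hand and backprop the scalar
x2 = torch.ones(2, 2, requires_grad=True)
l = (x2 * x2 * w).sum()       # l = dot(y, w)
l.backward()
print(x2.grad)                # same: tensor([[6., 6.], [6., 6.]])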
So for your code, PyTorch first computes l = dot(x, w), where w holds the constant values of x (all 2s), i.e. l = 2*(x11 + x12 + x21 + x22); then dl/dx, which is 2 for every element, is what your code returns.
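To see this concretely, here is a sketch that reproduces your result by writing out the dot product explicitly (torch.full and .detach() are my choices, not part of your code):
import torch

x = torch.full((2, 2), 2.0, requires_grad=True)
w = x.detach()       # the *values* of x, frozen as a constant weight
l = (x * w).sum()    # l = dot(x, w) = 2*(x11 + x12 + x21 + x22)
l.backward()
print(x.grad)        # tensor([[2., 2.], [2., 2.]]) -- exactly what x.backward(x) gives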
In short: when you call y.backward(w), just pass a tensor full of 1s if y is not a scalar; if y is a scalar, pass no argument at all.
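A sketch of both options, using z = x*x as a stand-in for any non-scalar result:
import torch

x = torch.full((2, 2), 2.0, requires_grad=True)
z = x * x
z.backward(torch.ones_like(z))   # non-scalar z: pass ones, i.e. differentiate z.sum()
print(x.grad)                    # tensor([[4., 4.], [4., 4.]])

x2 = torch.full((2, 2), 2.0, requires_grad=True)
(x2 * x2).sum().backward()       # scalar result: no argument needed
print(x2.grad)                   # same: tensor([[4., 4.], [4., 4.]])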
Upvotes: 0
Reputation: 8699
Actually, this is what you are looking for:
Case 1: when z = 2*x**3 + x
import torch
from torch.autograd import Variable
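# Note: Variable is deprecated since PyTorch 0.4; a plain tensor created
# with requires_grad=True behaves identically here.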
x = Variable(2*torch.ones(2, 2), requires_grad=True)
z = x*x*x*2+x
z.backward(torch.ones_like(z))
print(x.grad)
output:
tensor([[25., 25.],
[25., 25.]])
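(Check: dz/dx = 6x^2 + 1, which is 6*4 + 1 = 25 at x = 2.)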
Case 2: when z = x*x
x = Variable(2*torch.ones(2, 2), requires_grad=True)
z = x*x
z.backward(torch.ones_like(z))
print(x.grad)
output:
tensor([[4., 4.],
[4., 4.]])
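(Check: dz/dx = 2x = 4 at x = 2.)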
Case 3: when z = x (your case)
x = Variable(2*torch.ones(2, 2), requires_grad=True)
z = x
z.backward(torch.ones_like(z))
print(x.grad)
output:
tensor([[1., 1.],
[1., 1.]])
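(Check: dz/dx = 1 everywhere -- the all-ones gradient you originally expected; the key is passing torch.ones_like(z) to backward() instead of x.)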
To learn more about how to calculate gradients in PyTorch, check this.
Upvotes: 1