Reputation: 675
In PyTorch, what is the meaning of y.backward([0.1, 1.0, 0.0001])?
I understand that y.backward()
means doing backpropagation.
But what is the meaning of [0.1, 1.0, 0.0001]
in y.backward([0.1, 1.0, 0.0001])
?
Upvotes: 4
Views: 1238
Reputation: 395
This is how autograd works when the output is a vector. We cannot take the gradient of a vector directly, so we first convert it into a scalar; the gradient argument supplies the weights used for that conversion.
For example, input: x = [x1,x2,x3] and the operation: y = 2*x = [2*x1,2*x2,2*x3]
Then dy/dx is not a single gradient vector. If we call y.backward(torch.tensor([0.1, 1, 0.001])), it is as if we defined another variable out = torch.sum(y * torch.tensor([0.1, 1, 0.001])) = 0.2*x1 + 2*x2 + 0.002*x3.
Then we can compute d(out)/dx, and d(out)/dx will be stored in x.grad. In our example, x.grad = [d(out)/dx1, d(out)/dx2, d(out)/dx3] = [0.2, 2, 0.002].
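The example above can be checked directly. A minimal sketch (the input values for x are illustrative, since the gradient of y = 2*x does not depend on x):

```python
import torch

# y = 2 * x, then backward with the gradient weights [0.1, 1, 0.001].
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = 2 * x
y.backward(torch.tensor([0.1, 1.0, 0.001]))

# dy_i/dx_i = 2 for each component, so each weight is multiplied by 2.
print(x.grad)  # tensor([0.2000, 2.0000, 0.0020])
```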
Upvotes: 0
Reputation: 51
First, it's not y.backward([0.1, 1.0, 0.0001]), since in PyTorch the argument should be a Tensor. So the right call is y.backward(torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)).
See the autograd documentation for details.
Secondly, torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float) creates a 1-d tensor with 3 elements in it. The call y.backward(torch.tensor([0.1, 1.0, 0.0001])) actually computes a vector product: each element of the argument weights the corresponding component of y in the backward pass.
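The equivalence between passing a gradient tensor and first reducing y to a weighted scalar can be sketched as follows (the choice of x and y here is illustrative, not from the question):

```python
import torch

v = torch.tensor([0.1, 1.0, 0.0001])

# Route 1: pass v as the gradient argument to backward().
x1 = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y1 = x1 ** 2
y1.backward(v)

# Route 2: weight y by v, reduce to a scalar, then call plain backward().
x2 = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y2 = x2 ** 2
(y2 * v).sum().backward()

# Both routes produce the same gradient in x.grad.
print(torch.allclose(x1.grad, x2.grad))  # True
```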
Upvotes: 0
Reputation: 37691
The expression y.backward([0.1, 1.0, 0.0001])
is actually wrong. It should be y.backward(torch.Tensor([0.1, 1.0, 0.0001]))
, where torch.Tensor([0.1, 1.0, 0.0001])
is the gradient argument: the weight given to each component of y when the derivative with respect to the inputs is computed.
Example:
x = Variable(torch.ones(2, 2), requires_grad=True)
y = (x + 2).mean()
y.backward(torch.Tensor([1.0]))
print(x.grad)
Here, y = mean(x + 2) = sum_i (x_i + 2)/4
and so, dy/dx_i = 1/4 = 0.25
for every element, independent of the value of x_i
. Also note, y.backward(torch.Tensor([1.0]))
and y.backward()
are equivalent, since y is a scalar.
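In current PyTorch versions, Variable has been merged into Tensor, so the same example can be written without it. A sketch with the same numbers as above:

```python
import torch

# requires_grad on the tensor replaces the old Variable wrapper.
x = torch.ones(2, 2, requires_grad=True)
y = (x + 2).mean()

# For a scalar y, backward() needs no gradient argument.
y.backward()
print(x.grad)  # each entry is dy/dx_i = 1/4 = 0.25
```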
If you do:
y.backward(torch.Tensor([0.1]))
print(x.grad)
it prints:
Variable containing:
1.00000e-02 *
2.5000 2.5000
2.5000 2.5000
[torch.FloatTensor of size 2x2]
It is simply 0.1 * 0.25 = 0.025
. So, now if you compute:
y.backward(torch.Tensor([0.1, 0.01]))
print(x.grad)
Then it prints:
Variable containing:
1.00000e-02 *
2.5000 0.2500
2.5000 0.2500
[torch.FloatTensor of size 2x2]
Where, dy/dx_11 = dy/dx_21 = 0.1 * 0.25 = 0.025
and dy/dx_12 = dy/dx_22 = 0.01 * 0.25 = 0.0025
.
See the function prototype of backward() in the PyTorch autograd documentation.
Upvotes: 2