jung hyemin

Reputation: 675

In pytorch, the meaning of y.backward([0.1, 1.0, 0.0001])

In pytorch what is the meaning of y.backward([0.1, 1.0, 0.0001])?

I understand that y.backward() means doing backpropagation. But what is the meaning of [0.1, 1.0, 0.0001] in y.backward([0.1, 1.0, 0.0001])?

Upvotes: 4

Views: 1238

Answers (3)

Zhihui Shao

Reputation: 395

This is how autograd works when the output is a vector. We cannot take the gradient of a vector directly; we first need to reduce the vector to a scalar. The gradient argument supplies the weights used to reduce the vector to that scalar.

For example, take the input x = [x1, x2, x3] and the operation y = 2*x = [2*x1, 2*x2, 2*x3].

Then we cannot obtain dy/dx directly. Calling y.backward(torch.tensor([0.1, 1, 0.001])) means we introduce an auxiliary scalar: output = torch.sum(y * [0.1, 1, 0.001]) = 0.2*x1 + 2*x2 + 0.002*x3.

Then we can compute d(out)/dx, which is stored in x.grad. In our example, x.grad = [d(out)/dx1, d(out)/dx2, d(out)/dx3] = [0.2, 2, 0.002].
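The example above can be verified directly. This is a minimal sketch, assuming a reasonably recent PyTorch:

```python
import torch

# Leaf tensor with x1 = x2 = x3 = 1
x = torch.tensor([1.0, 1.0, 1.0], requires_grad=True)
y = 2 * x                          # y = [2*x1, 2*x2, 2*x3], a vector

# The gradient argument: weights that turn y into the scalar sum(y * v)
v = torch.tensor([0.1, 1.0, 0.001])
y.backward(v)

print(x.grad)                      # values: [0.2, 2.0, 0.002]
```

Since dy_i/dx_i = 2 for every component, x.grad is just 2 * v, matching the hand computation above.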

Upvotes: 0

Brandonnnn

Reputation: 51

First, it's not y.backward([0.1, 1.0, 0.0001]), since in PyTorch the argument should be a Tensor. So the correct call is y.backward(torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)). See the autograd documentation for details.

Secondly, torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float) creates a 1-d tensor with 3 elements. The call y.backward(torch.tensor([0.1, 1.0, 0.0001])) actually computes a vector-Jacobian product: each component of y is weighted by the corresponding entry of this tensor.
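That vector-Jacobian product can be illustrated with a small sketch (assuming a recent PyTorch; the example function and variable names here are my own):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2                         # Jacobian dy/dx is diag(2 * x)

v = torch.tensor([0.1, 1.0, 0.0001])
y.backward(v)                      # computes J^T @ v into x.grad

# Because the Jacobian is diagonal here, J^T @ v is just (2 * x) * v:
manual = 2 * torch.tensor([1.0, 2.0, 3.0]) * v
print(x.grad)                      # values: [0.2, 4.0, 0.0006]
print(manual)                      # same values
```

So the tensor passed to backward() is not an input being differentiated; it is the weighting vector in the product with the Jacobian of y.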

Upvotes: 0

Wasi Ahmad

Reputation: 37691

The expression y.backward([0.1, 1.0, 0.0001]) is actually wrong. It should be y.backward(torch.Tensor([0.1, 1.0, 0.0001])), where torch.Tensor([0.1, 1.0, 0.0001]) is the gradient argument: the weights applied to the components of y when the derivative is computed.


Example:

x = Variable(torch.ones(2, 2), requires_grad=True)
y = (x + 2).mean()
y.backward(torch.Tensor([1.0]))
print(x.grad)

Here, y = (1/4) * Σ (x_ij + 2), so dy/dx_ij = 0.25 for every element. Also note that y.backward(torch.Tensor([1.0])) and y.backward() are equivalent.

If you do:

y.backward(torch.Tensor([0.1]))
print(x.grad)

it prints:

Variable containing:
1.00000e-02 *
  2.5000  2.5000
  2.5000  2.5000
[torch.FloatTensor of size 2x2]

It is simply 0.1 * 0.25 = 0.025. So, now if you compute:

y.backward(torch.Tensor([0.1, 0.01]))
print(x.grad)

Then it prints:

Variable containing:
1.00000e-02 *
  2.5000  0.2500
  2.5000  0.2500
[torch.FloatTensor of size 2x2]

Where dy/dx_11 = dy/dx_21 = 0.025 and dy/dx_12 = dy/dx_22 = 0.0025.

See the function signature of backward() in the documentation. You may also consider looking into this example.
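For readers on newer PyTorch releases: Variable has since been merged into Tensor, so the scalar example above can be written as follows (a sketch, not the original code; note that recent versions require the gradient argument's shape to match y's, so a scalar y takes a scalar gradient):

```python
import torch

# requires_grad on the tensor replaces the old Variable wrapper
x = torch.ones(2, 2, requires_grad=True)
y = (x + 2).mean()                 # y = (1/4) * sum(x_ij + 2), a scalar

y.backward(torch.tensor(0.1))      # scale the unit gradient by 0.1
print(x.grad)                      # every entry is 0.1 * 0.25 = 0.025
```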

Upvotes: 2
