nmospan

Reputation: 139

pytorch: compute vector-Jacobian product for vector function

Good day!

I am trying to grasp the basics of torch.autograd. In particular, I want to test this statement from https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#sphx-glr-beginner-blitz-autograd-tutorial-py:


[Image: the tutorial's statement that torch.autograd is an engine for computing vector-Jacobian products, i.e. given a vector v, backward computes v^T . J]


So my idea is to construct a vector function, say:

(y_1, y_2, y_3) = (x_1^2 + x_2, x_2^2 + x_3, x_3^2)

Then compute the Jacobian matrix at the point (1, 1, 1) and multiply it by the vector (3, 5, 7).

J = ( 2x_1   1      0    )
    ( 0      2x_2   1    )
    ( 0      0      2x_3 )

I am expecting the result J(x=(1,1,1)) * v = (2*3 + 5, 2*5 + 7, 2*7) = (11, 17, 14).
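
To double-check this arithmetic, here is the same product in plain torch, with the Jacobian hard-coded at x = (1, 1, 1) (no autograd involved):

import torch

# Jacobian evaluated by hand at x = (1, 1, 1)
J = torch.tensor([[2., 1., 0.],
                  [0., 2., 1.],
                  [0., 0., 2.]])
v = torch.tensor([3., 5., 7.])
print(J @ v)  # tensor([11., 17., 14.])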

Below is my attempt to do it in PyTorch:

import torch

x = torch.ones(3, requires_grad=True)
print(x)

y = torch.tensor([x[0]**2 + x[1], x[1]**2 + x[2], x[2]**2], requires_grad=True)
print(y)

v = torch.tensor([3., 5., 7.])

y.backward(v)
x.grad

which gives the unexpected result (2., 2., 1.). I think I am defining the tensor y in the wrong way. If I simply do y = x * 2, the gradient works, but what about building a more complex tensor like the one in this case?
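
For reference, a minimal version of the simple case that does work (here y = x * 2, whose Jacobian is 2*I, so backward yields 2*v):

x = torch.ones(3, requires_grad=True)
y = x * 2                                # a real autograd op, so y stays linked to x
y.backward(torch.tensor([3., 5., 7.]))
print(x.grad)                            # tensor([ 6., 10., 14.]) = 2 * v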

Thank you.

Upvotes: 5

Views: 1532

Answers (1)

charles yin

Reputation: 89

You should not define the tensor y with torch.tensor(): torch.tensor() is a constructor, not an operator, so it is not recorded in the autograd graph and the resulting y is detached from x. Use torch.stack() instead, which is a differentiable operation.

Just change that line to:

y = torch.stack((x[0]**2+x[1], x[1]**2+x[2], x[2]**2))

The result of x.grad should then be tensor([ 6., 13., 19.]). Note that backward(v) computes the vector-Jacobian product v^T . J (equivalently J^T . v), not J . v, which is why you get (6, 13, 19) rather than the (11, 17, 14) you expected: J^T . v at x = (1, 1, 1) is (2*3, 3 + 2*5, 5 + 2*7) = (6, 13, 19).
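
For completeness, a minimal runnable version of the fix (the comments and the printed value reflect the computation above):

import torch

x = torch.ones(3, requires_grad=True)

# torch.stack is a differentiable op, so y stays connected to x in the graph
y = torch.stack((x[0]**2 + x[1], x[1]**2 + x[2], x[2]**2))

v = torch.tensor([3., 5., 7.])
y.backward(v)   # accumulates v^T . J into x.grad
print(x.grad)   # tensor([ 6., 13., 19.])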

Upvotes: 5
