John_maddon

Reputation: 152

PyTorch - How to differentiate w.r.t. two parameters

I am interested in computing combined derivatives using PyTorch: [image: the derivative in question]

I have tried this in the code below, but it computes two separate partial derivatives (i.e. first df/dx and then df/dy). Is it possible to modify the code so that it computes the derivative with respect to both parameters?

import torch
def function(x,y):
    f = x**3+y**3
    return f

a =  torch.tensor([4., 5., 6.], requires_grad=True)
b =  torch.tensor([1., 2., 6.], requires_grad=True)
derivative = torch.autograd.functional.jacobian(function, (a,b))
print(derivative)

Thanks in advance!

Upvotes: 2

Views: 429

Answers (1)

Ivan

Reputation: 40628

You can use torch.autograd.functional.hessian to get the combined derivatives.

>>> import torch.autograd.functional as A
>>> f = lambda x, y: (x**3 + y**3).mean()
>>> H = A.hessian(f, (a, b))

Since you have two inputs, the result will be a tuple containing 2 tuples.

More precisely, you will have

  • H[0][0] the 2nd derivative w.r.t. x: d²f/dx_i dx_j

  • H[0][1] the combined derivative w.r.t. x and y: d²f/dx_i dy_j

  • H[1][0] the combined derivative w.r.t. y and x: d²f/dy_i dx_j

  • H[1][1] the 2nd derivative w.r.t. y: d²f/dy_i dy_j


>>> H
((tensor([[ 8.,  0.,  0.],
          [ 0., 10.,  0.],
          [ 0.,  0., 12.]]),
  tensor([[ 0.,  0.,  0.],
          [ 0.,  0.,  0.],
          [ 0.,  0.,  0.]])),
 (tensor([[ 0.,  0.,  0.],
          [ 0.,  0.,  0.],
          [ 0.,  0.,  0.]]),
  tensor([[ 2.,  0.,  0.],
          [ 0.,  4.,  0.],
          [ 0.,  0., 12.]])))

Indeed if you look at the combined derivative: d²(x³+y³)/dxdy = d(3x²)/dy = 0, hence H[0][1] and H[1][0] are zero matrices.

On the other hand, we have d²x³/dx² = 6x; since f averages over the three values, this gives 6x/3 = 2x. Similarly, d²y³/dy² = 6y, which gives 6y/3 = 2y after averaging.

As a result, you find that H[0][0] = diag(2a) and H[1][1] = diag(2b).
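
If you want to check this yourself, here is a minimal sketch that reproduces the four blocks and compares them with the closed forms above. The second function g is not from the question; it is a hypothetical example I made up with a cross term x²·y, included only to show what a non-zero combined block looks like.

```python
import torch
from torch.autograd.functional import hessian

a = torch.tensor([4., 5., 6.])
b = torch.tensor([1., 2., 6.])

def f(x, y):
    # hessian needs a scalar output, hence the mean over the 3 elements
    return (x**3 + y**3).mean()

# H is a 2x2 tuple of tuples; H[i][j] holds d²f / d(input_i) d(input_j)
H = hessian(f, (a, b))

print(torch.allclose(H[0][0], torch.diag(2 * a)))      # block w.r.t. x, x
print(torch.allclose(H[1][1], torch.diag(2 * b)))      # block w.r.t. y, y
print(torch.count_nonzero(H[0][1]).item() == 0)        # combined block is all zeros
print(torch.count_nonzero(H[1][0]).item() == 0)

# Hypothetical function (an assumption, not from the question) where x and y interact:
# g = mean(x² · y), so d²g / dx_i dy_j = (2 x_i / 3) on the diagonal, 0 elsewhere.
def g(x, y):
    return (x**2 * y).mean()

G = hessian(g, (a, b))
print(G[0][1])  # combined derivative w.r.t. x and y, no longer zero
```

For f the two combined blocks come back as zero matrices because x and y never interact, while for g the combined block should be diag(2a/3).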

Upvotes: 3
