Reputation: 152
I am interested in computing mixed derivatives using PyTorch.
In the code below, I have tried, but it computes the two partial derivatives separately (i.e. first df/dx and then df/dy). Is it possible to modify the code so that it computes the derivative with respect to both parameters at once?
import torch
def function(x, y):
    f = x**3 + y**3
    return f
a = torch.tensor([4., 5., 6.], requires_grad=True)
b = torch.tensor([1., 2., 6.], requires_grad=True)
derivative = torch.autograd.functional.jacobian(function, (a,b))
print(derivative)
Thanks in advance!
Upvotes: 2
Views: 429
Reputation: 40628
You can use torch.autograd.functional.hessian to get the combined (mixed) derivatives.
>>> f = lambda x, y: (x**3 + y**3).mean()
>>> H = torch.autograd.functional.hessian(f, (a, b))
Since you have two inputs, the result will be a tuple containing 2 tuples.
More precisely, you will have

H[0][0]: the 2nd derivative w.r.t. x: d²z/dx_i·dx_j
H[0][1]: the mixed derivative w.r.t. x and y: d²z/dx_i·dy_j
H[1][0]: the mixed derivative w.r.t. y and x: d²z/dy_i·dx_j
H[1][1]: the 2nd derivative w.r.t. y: d²z/dy_i·dy_j
>>> H
((tensor([[ 8.,  0.,  0.],
          [ 0., 10.,  0.],
          [ 0.,  0., 12.]]),
  tensor([[0., 0., 0.],
          [0., 0., 0.],
          [0., 0., 0.]])),
 (tensor([[0., 0., 0.],
          [0., 0., 0.],
          [0., 0., 0.]]),
  tensor([[ 2.,  0.,  0.],
          [ 0.,  4.,  0.],
          [ 0.,  0., 12.]])))
Indeed, if you look at the mixed derivative: d²(x³+y³)/dxdy = d(3x²)/dy = 0, hence H[0][1] and H[1][0] are zero matrices.
On the other hand, we have d²x³/dx² = 6x; since f averages the three components, this gives 6x/3 = 2x. Similarly, d²y³/dy² = 6y gives 2y.
As a result, you find that H[0][0] = diag(2a) and H[1][1] = diag(2b).
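As a quick sanity check, here is a minimal sketch (assuming PyTorch is installed) that computes the Hessian for the tensors from the question and compares each block against the closed-form results derived above:

```python
import torch

# Scalar-valued function: mean of x^3 + y^3 over the three components
f = lambda x, y: (x**3 + y**3).mean()

a = torch.tensor([4., 5., 6.], requires_grad=True)
b = torch.tensor([1., 2., 6.], requires_grad=True)

# 2x2 nested tuple of 3x3 blocks
H = torch.autograd.functional.hessian(f, (a, b))

# Diagonal blocks match diag(2a) and diag(2b); mixed blocks are zero.
print(torch.allclose(H[0][0], torch.diag(2 * a)))   # True
print(torch.allclose(H[1][1], torch.diag(2 * b)))   # True
print(torch.equal(H[0][1], torch.zeros(3, 3)))      # True
print(torch.equal(H[1][0], torch.zeros(3, 3)))      # True
```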
Upvotes: 3