Reputation: 2634
I do a lot of simulations, for which I often need to minimise complicated user-defined functions, and for this I generally use numpy and scipy.optimize.minimize(). The problem with this is that I need to explicitly write down a gradient function, which can sometimes be very difficult or impossible to find. And for large-dimensional vectors, the numerical derivatives calculated by scipy are prohibitively expensive.
So I am trying to switch to TensorFlow or PyTorch, both to take advantage of their automatic differentiation capabilities and to be able to exploit GPUs freely. Let me give an explicit example of a function whose derivative is somewhat complicated to write down (it would require a lot of chain rule) and thus seems ripe for TensorFlow or PyTorch: calculating the dihedral angle between the two triangles formed by four points in 3D space.
import numpy as np

def dihedralAngle(xyz):
    ## calculate the dihedral angle between the triangles formed by 4 nodes
    p1, p2, p3, p4 = 0, 1, 2, 3
    ## get unit normal vectors of the two triangles
    N1 = np.cross(xyz[p1] - xyz[p3], xyz[p2] - xyz[p3])
    N2 = -np.cross(xyz[p1] - xyz[p4], xyz[p2] - xyz[p4])
    n1, n2 = N1 / np.linalg.norm(N1), N2 / np.linalg.norm(N2)
    angle = np.arccos(np.dot(n1, n2))
    return angle

xyz1 = np.array([[0.2       , 0.        , 0.        ],
                 [0.198358  , 0.02557543, 0.        ],
                 [0.19345897, 0.05073092, 0.        ],
                 [0.18538335, 0.0750534 , 0.        ]])  # or any (4, 3) ndarray

print(dihedralAngle(xyz1))  # 3.141...
I could easily minimise this using scipy.optimize.minimize(), and I should get 0. For such a small function, I don't really need a gradient (explicit or numerical). However, if I wish to iterate over many, many nodes and minimise some function that depends on all the dihedral angles, the overhead becomes much higher.
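For concreteness, the call I have in mind is something like this sketch (since no gradient is supplied, scipy falls back to numerical differences):

from scipy.optimize import minimize

# scipy optimises over a flat vector, so reshape inside the objective;
# with no jac given, the gradient is estimated by finite differences
res = minimize(lambda x: dihedralAngle(x.reshape(4, 3)), xyz1.ravel())
print(res.fun)  # should approach 0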
My questions then --

1. How would I minimise this function using TensorFlow or PyTorch? Both for a single dihedral angle, and for a list of such angles (i.e. we need to account for looping over lists).

2. Could I also just extract the automatically computed gradient and supply it to scipy.optimize.minimize() if desired? For example, scipy.optimize.minimize() allows easily for bounds and constraints, something that I haven't noticed in the TensorFlow or PyTorch optimisation modules.

Upvotes: 3
Views: 1563
Reputation: 101
Here is a solution using automatic computation of the gradient by torch, combined with the minimizer of scipy, via a lib I wrote: autograd-minimize. The advantage over SGD is better precision of the estimate (scipy uses second-order methods). It is probably equivalent to using LBFGS from torch:
import numpy as np
import torch
from autograd_minimize import minimize

# original numpy version from the question, for reference
def dihedralAngle(xyz):
    ## calculate dihedral angle between 4 nodes
    p1, p2, p3, p4 = 0, 1, 2, 3
    ## get unit normal vectors
    N1 = np.cross(xyz[p1] - xyz[p3], xyz[p2] - xyz[p3])
    N2 = -np.cross(xyz[p1] - xyz[p4], xyz[p2] - xyz[p4])
    n1, n2 = N1 / np.linalg.norm(N1), N2 / np.linalg.norm(N2)
    angle = np.arccos(np.dot(n1, n2))
    return angle

def compute_angle(p1, p2):
    # batched dot product over the last axis (generalises torch.dot)
    inner_product = (p1 * p2).sum(-1)
    p1_norm = torch.linalg.norm(p1, dim=-1)
    p2_norm = torch.linalg.norm(p2, dim=-1)
    cos = inner_product / (p1_norm * p2_norm)
    # clamp so that acos (and its gradient) stays finite
    cos = torch.clamp(cos, -0.99999, 0.99999)
    angle = torch.acos(cos)
    return angle

def compute_dihedral(v1, v2, v3, v4):
    ab = v1 - v2
    cb = v3 - v2
    db = v4 - v3
    u = torch.cross(ab, cb, dim=-1)
    v = torch.cross(db, cb, dim=-1)
    w = torch.cross(u, v, dim=-1)
    angle = compute_angle(u, v)
    # the angle between cb and w is close to either 0 or pi;
    # use it to pick the sign of the dihedral
    angle = torch.where(compute_angle(cb, w) > 1, -angle, angle)
    return angle

def loss_func(v1, v2, v3, v4):
    # drive the dihedral angle towards -2
    return ((compute_dihedral(v1, v2, v3, v4) + 2) ** 2).mean()

x0 = [np.array([-17.0490, 5.9270, 21.5340]),
      np.array([-0.1608, 0.0600, -0.0371]),
      np.array([-0.2000, 0.0007, -0.0927]),
      np.array([-0.1423, 0.0197, -0.0727])]

res = minimize(loss_func, x0, backend='torch')
print(compute_dihedral(*[torch.tensor(v) for v in res.x]))
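If you prefer not to add a dependency, the same wiring can be done by hand. Here is a minimal sketch (names like loss_and_grad are mine, not from any library) that evaluates the loss and its gradient with torch and hands both to scipy via jac=True; this also answers the second question, since bounds and constraints can then be passed to scipy as usual:

import numpy as np
import torch
from scipy.optimize import minimize as sp_minimize

def loss_and_grad(x_flat):
    # scipy optimises a flat float64 vector; reshape it into 4 points in 3D
    x = torch.tensor(x_flat.reshape(4, 3), requires_grad=True)
    loss = (compute_dihedral(x[0], x[1], x[2], x[3]) + 2) ** 2
    loss.backward()
    return loss.item(), x.grad.numpy().ravel()

x0_flat = np.concatenate(x0)  # reuse the starting point from above
res = sp_minimize(loss_and_grad, x0_flat, jac=True, method='L-BFGS-B')
print(res.fun)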
Upvotes: 1
Reputation: 121
I'm working on the same thing. Here is what I got.
import torch

def compute_angle(p1, p2):
    # batched dot product over the last axis (generalises torch.dot)
    inner_product = (p1 * p2).sum(-1)
    p1_norm = torch.linalg.norm(p1, dim=-1)
    p2_norm = torch.linalg.norm(p2, dim=-1)
    cos = inner_product / (p1_norm * p2_norm)
    # clamp so that acos (and its gradient) stays finite
    cos = torch.clamp(cos, -0.99999, 0.99999)
    angle = torch.acos(cos)
    return angle

def compute_dihedral(v1, v2, v3, v4):
    ab = v1 - v2
    cb = v3 - v2
    db = v4 - v3
    u = torch.cross(ab, cb, dim=-1)
    v = torch.cross(db, cb, dim=-1)
    w = torch.cross(u, v, dim=-1)
    angle = compute_angle(u, v)
    # the angle between cb and w is close to either 0 or pi;
    # use it to pick the sign of the dihedral
    angle = torch.where(compute_angle(cb, w) > 1, -angle, angle)
    return angle
v1 = torch.tensor([-17.0490, 5.9270, 21.5340], requires_grad=True)
v2 = torch.tensor([-0.1608, 0.0600, -0.0371], requires_grad=True)
v3 = torch.tensor([-0.2000, 0.0007, -0.0927], requires_grad=True)
v4 = torch.tensor([-0.1423, 0.0197, -0.0727], requires_grad=True)

dihedral = compute_dihedral(v1, v2, v3, v4)
target_dihedral = -2
print(dihedral)  # should print -2.6387

learning_rate = 0.001
for i in range(100):
    dihedral = compute_dihedral(v1, v2, v3, v4)
    loss = (dihedral - target_dihedral) ** 2
    loss.backward()
    with torch.no_grad():
        # plain gradient-descent step on the four points
        v1 -= learning_rate * v1.grad
        v2 -= learning_rate * v2.grad
        v3 -= learning_rate * v3.grad
        v4 -= learning_rate * v4.grad
    # manually zero the gradients after updating the points
    v1.grad = None
    v2.grad = None
    v3.grad = None
    v4.grad = None

print(compute_dihedral(v1, v2, v3, v4))  # should print (close to) -2
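Since compute_angle reduces over the last axis, the same functions work unchanged on batches, which covers the "looping over lists" part of the question. A quick sketch (the batch shape here is my own choice for illustration):

# batch of N quadruples: V[:, k] is the k-th vertex of every quadruple
N = 50
V = torch.randn(N, 4, 3, requires_grad=True)
angles = compute_dihedral(V[:, 0], V[:, 1], V[:, 2], V[:, 3])  # shape (N,)
loss = ((angles - target_dihedral) ** 2).mean()
loss.backward()  # V.grad now holds d(loss)/dV for all N quadruples at once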
Upvotes: 0