Reputation: 1665
I have a simple convex problem I am trying to speed up the solution of. I am solving the argmin over theta of

theta* = argmin_theta  -sum_{t=1}^{T} log(1 + theta^T r_t)

where theta and r_t are Nx1.
I can solve this easily with cvxpy:
import numpy as np
from scipy.optimize import minimize
import cvxpy
np.random.seed(123)
T = 50
N = 5
R = np.random.uniform(-1, 1, size=(T, N))
cvtheta = cvxpy.Variable(N)
fn = -sum([cvxpy.log(1 + cvtheta.T * rt) for rt in R])
prob = cvxpy.Problem(cvxpy.Minimize(fn))
prob.solve()
prob.status
#'optimal'
prob.value
# -5.658335088091929
cvtheta.value
# matrix([[-0.82105079],
# [-0.35475695],
# [-0.41984643],
# [ 0.66117397],
# [ 0.46065358]])
But for a larger R this gets too slow, so I am trying a gradient-based method with scipy's fmin_cg (here called through minimize with method='CG'). goalfun is a scipy.optimize.minimize-friendly function that returns both the function value and the gradient:
def goalfun(theta, *args):
    R = args[0]
    N = R.shape[1]
    common = (1 + np.sum(theta * R, axis=1))**-1  # 1 / (1 + theta . r_t) for each row
    if np.any(common < 0):
        # log argument went negative: return a large penalty value and gradient
        return 1e2, 1e2 * np.ones(N)
    fun = np.sum(np.log(common))  # equals -sum_t log(1 + theta . r_t)
    # thetaprime: j-th column is theta with its j-th entry replaced by 1
    thetaprime = np.tile(theta, (N, 1)).T
    np.fill_diagonal(thetaprime, np.ones(N))
    grad = np.sum(np.dot(R, thetaprime) * common[:, None], axis=0)
    return fun, grad
Making sure the function and gradient are correct at the cvxpy optimum:
goalfun(np.squeeze(np.asarray(cvtheta.value)), R)
# (-5.6583350819293603,
# array([ -9.12423065e-09, -3.36854633e-09, -1.00983679e-08,
# -1.49619901e-08, -1.22987872e-08]))
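For reference, the gradient can also be checked against finite differences away from the optimum. Below is a minimal sketch using scipy.optimize.check_grad, writing the closed-form gradient of -sum_t log(1 + theta . r_t), namely -sum_t r_t / (1 + theta . r_t), directly; f, g, and theta_test are illustrative names, and the small scaling of the test point is just an assumption to keep it feasible for this R.
from scipy.optimize import check_grad

def f(theta):
    # objective: -sum_t log(1 + theta . r_t)
    return -np.sum(np.log(1 + R.dot(theta)))

def g(theta):
    # closed-form gradient: -sum_t r_t / (1 + theta . r_t)
    common = (1 + R.dot(theta))**-1
    return -np.sum(R * common[:, None], axis=0)

theta_test = 0.1 * np.random.rand(N)  # small, so 1 + theta . r_t stays > 0 here
check_grad(f, g, theta_test)          # ~1e-7 or smaller if g matches f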
But solving this just yields garbage, regardless of method, iterations, etc. (The only thing that yields "Optimization terminated successfully" is an x0 practically equal to the optimal theta.)
x0 = np.random.rand(R.shape[1])
minimize(fun=goalfun, x0=x0, args=R, jac=True, method='CG')
# fun: 3.3690101669818775
# jac: array([-11.07449021, -14.04017873, -13.38560561, -5.60375334, -2.89210078])
# message: 'Desired error not necessarily achieved due to precision loss.'
# nfev: 25
# nit: 1
# njev: 13
# status: 2
# success: False
# x: array([ 0.00892177, 0.24404118, 0.51627475, 0.21119326, -0.00831957])
I.e., this seemingly innocuous problem that cvxpy handles with ease turns out to be completely pathological for a general-purpose (non-convex) solver. Is this problem really that nasty, or am I missing something? What would be an alternative way to speed this up?
Upvotes: 1
Views: 627
Reputation: 5521
I believe the issue is that it is possible for theta to be such that the log argument becomes negative. It seems that you have identified this issue and have goalfun return the tuple (100, 100*ones(N)) in that case, apparently as a heuristic attempt to suggest to the solver that this "solution" is not preferable. However, a stronger condition must be imposed: this "solution" is not feasible. Of course, this can be done by providing appropriate constraints. (Interestingly, cvxpy appears to handle this issue automatically.)

Here is a sample run, without bothering with providing derivatives. Note the use of a feasible initial estimate x0.
import numpy as np
from scipy.optimize import minimize

np.random.seed(123)
T = 50
N = 5
R = np.random.uniform(-1, 1, size=(T, N))

def goalfun(theta, *args):
    R = args[0]
    common = (1 + np.sum(theta * R, axis=1))**-1  # 1 / (1 + theta . r_t)
    return np.sum(np.log(common))

def con_fun(theta, *args):
    # feasibility: every log argument 1 + theta . r_t must stay positive
    R = args[0]
    return 1 + np.sum(theta * R, axis=1)

cons = ({'type': 'ineq', 'fun': lambda x: con_fun(x, R)})

x0 = np.zeros(R.shape[1])
minimize(fun=goalfun, x0=x0, args=R, constraints=cons)
# fun: -5.658334806882614
# jac: array([ 0.0019, -0.0004, -0.0003, 0.0005, -0.0015, 0. ])
# message: 'Optimization terminated successfully.'
# nfev: 92
# nit: 12
# njev: 12
# status: 0
# success: True
# x: array([-0.8209, -0.3547, -0.4198, 0.6612, 0.4605])
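Since the original motivation was speed, the analytic gradient could also be supplied to the constrained solve. A minimal sketch, reusing R, cons, and x0 from above; goalfun_grad is an illustrative name, and the gradient expression is simply the closed-form derivative of -sum_t log(1 + theta . r_t):
def goalfun_grad(theta, *args):
    # returns (value, gradient) of -sum_t log(1 + theta . r_t)
    R = args[0]
    common = (1 + np.sum(theta * R, axis=1))**-1
    fun = np.sum(np.log(common))
    grad = -np.sum(R * common[:, None], axis=0)
    return fun, grad

minimize(fun=goalfun_grad, x0=x0, args=R, jac=True, constraints=cons)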
Note that when I run the constrained sample above, I get an invalid value encountered in log warning, indicating that at some point in the search a value of theta is tried that violates the constraints. However, the result is reasonably close to that of cvxpy. It would be interesting to check whether the cvxpy solution changes when the constraints are explicitly imposed in the cvxpy.Problem formulation.
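For completeness, a sketch of what that last experiment might look like. This assumes a recent cvxpy, which uses @ for matrix products rather than the cvtheta.T * rt style in the question, and the 1e-9 margin is an arbitrary choice to keep the log argument strictly positive:
cvtheta = cvxpy.Variable(N)
fn = -cvxpy.sum(cvxpy.log(1 + R @ cvtheta))
constraints = [1 + R @ cvtheta >= 1e-9]  # make the log domain an explicit constraint
prob = cvxpy.Problem(cvxpy.Minimize(fn), constraints)
prob.solve()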
Upvotes: 2