luffe
luffe

Reputation: 1665

Fast optimization of "pathological" convex function

I have a simple convex problem I am trying to speed up the solution of. I am solving the argmin (theta) of

eq

where theta and rt is Nx1.

I can solve this easily with cvxpy

import numpy as np
from scipy.optimize import minimize
import cvxpy

np.random.seed(123)

T = 50
N = 5
R = np.random.uniform(-1, 1, size=(T, N))

cvtheta = cvxpy.Variable(N)
fn = -sum([cvxpy.log(1 + cvtheta.T * rt) for rt in R])

prob = cvxpy.Problem(cvxpy.Minimize(fn))
prob.solve()

prob.status
#'optimal'

prob.value
# -5.658335088091929

cvtheta.value
# matrix([[-0.82105079],
#         [-0.35475695],
#         [-0.41984643],
#         [ 0.66117397],
#         [ 0.46065358]])

But for a larger R this gets too slow, so I am trying a gradient based method with scipy's fmin_cg:

goalfun is a scipy.minimize friendly function that returns the function value and the gradient.

def goalfun(theta, *args):
    R = args[0]
    N = R.shape[1]
    common = (1 + np.sum(theta * R, axis=1))**-1

    if np.any( common < 0 ):
        return 1e2, 1e2 * np.ones(N)

    fun = np.sum(np.log(common))

    thetaprime = np.tile(theta, (N, 1)).T
    np.fill_diagonal(thetaprime, np.ones(N))
    grad = np.sum(np.dot(R, thetaprime) * common[:, None], axis=0)

    return fun, grad

Making sure the function and gradients are correct:

goalfun(np.squeeze(np.asarray(cvtheta.value)), R)
# (-5.6583350819293603,
#  array([ -9.12423065e-09,  -3.36854633e-09,  -1.00983679e-08,
#          -1.49619901e-08,  -1.22987872e-08]))

But solving this just yields garbage, regardless of method, iterations, etc. (The only things that yields Optimization terminated successfully is if x0 is practically equal to the optimal theta)

x0 = np.random.rand(R.shape[1])

minimize(fun=goalfun, x0=x0, args=R, jac=True, method='CG')
#   fun: 3.3690101669818775
#      jac: array([-11.07449021, -14.04017873, -13.38560561,  -5.60375334,  -2.89210078])
#  message: 'Desired error not necessarily achieved due to precision loss.'
#     nfev: 25
#      nit: 1
#     njev: 13
#   status: 2
#  success: False
#        x: array([ 0.00892177,  0.24404118,  0.51627475,  0.21119326, -0.00831957])

I.e. this seemingly innocuous problem that cvxpy handles with ease, turns out to be completely pathological for a non-convex solver. Is this problem really that nasty, or am I missing something? What would be an alternative to speed this up?

Upvotes: 1

Views: 627

Answers (1)

Stelios
Stelios

Reputation: 5521

I believe the issue is that it is possible for theta to be such that the log argument becomes negative. It seems that you have identified this issue, and have goalfun return the tuple (100,100*ones(N)) in this case, apparently, as a heuristic attempt to suggest the solver that this "solution" is not preferable. However, a stronger condition must be imposed, i.e., this "solution" is not feasible. Of course, this can be done by providing appropriate constraints. (Interestingly, cvxpy appears to handle this issue automatically.)

Here is a sample run, without bothering with providing derivatives. Note the use of a feasible initial estimate x0.

np.random.seed(123)

T = 50
N = 5
R = np.random.uniform(-1, 1, size=(T, N))

def goalfun(theta, *args):
    R = args[0]
    N = R.shape[1]
    common = (1 + np.sum(theta * R, axis=1))**-1

    return np.sum(np.log(common))

def con_fun(theta, *args):
    R = args[0]

    return 1+np.sum(theta * R, axis=1)


cons = ({'type': 'ineq', 'fun': lambda x: con_fun(x, R)})

x0 = np.zeros(R.shape[1])
minimize(fun=goalfun, x0=x0, args=R, constraints=cons)
 fun: -5.658334806882614
 jac: array([ 0.0019, -0.0004, -0.0003,  0.0005, -0.0015,  0.    ])  message: 'Optimization terminated successfully.'
nfev: 92
 nit: 12
njev: 12   status: 0  success: True
   x: array([-0.8209, -0.3547, -0.4198,  0.6612,  0.4605])

Note that when I run this, I get an invalid value encountered in log warning, indicating that at some point in the search a value of theta is checked which barely satisfies the constraints. However, the result is reasonably close to that of cvxpy. It would be interesting to check if the cvxpy solution changes when the constraints are explicitly imposed in the cvxpy.Problem formulation.

Upvotes: 2

Related Questions