Eric Yu

Reputation: 27

Using LinearOperator as nonlinear constraints in Scipy optimize

I'm trying to use the optimization module in SciPy to solve a constrained optimization problem, and I need to implement the 'hess' argument. In SciPy's documentation and tutorial, the constraint Hessians are simply [[2, 0], [0, 0]] and [[2, 0], [0, 0]]. However, my Hessians are something like [[(-24)*x[0]**2 + 48*x[0] - 16, 0], [0, 0]] and [[(-48)*x[0]**2 + 192*x[0] - 176, 0], [0, 0]], so I cannot simply write them down as constant numpy arrays and multiply them. It seems that I should pass a LinearOperator object to the 'hess' argument, but the examples of using LinearOperator in both the scipy.optimize tutorial and the LinearOperator documentation are unclear, since they only show low-dimensional cases. How do I use it correctly?

The problem formulation is:

minimize    -x[0] - x[1]
subject to  -2*x[0]**4 + 8*x[0]**3 - 8*x[0]**2 + x[1] <= 2
            -4*x[0]**4 + 32*x[0]**3 - 88*x[0]**2 + 96*x[0] + x[1] <= 36
            0 <= x[0] <= 3,  0 <= x[1] <= 4

My code is:

import numpy as np
from scipy.optimize import Bounds
from scipy.optimize import NonlinearConstraint
from scipy.optimize import minimize

def f(x):
    # objective: maximize x[0] + x[1] by minimizing its negative
    return -x[0] - x[1]
def grad(x):
    # gradient of the objective
    return np.array([-1, -1])
def hess(x):
    # Hessian of the objective (identically zero)
    return np.array([[0, 0], [0, 0]])

def cons_f(x):
    return [(-2)*x[0]**4 + 8*x[0]**3 + (-8)*x[0]**2 + x[1] - 2,
            (-4)*x[0]**4 + 32*x[0]**3 + (-88)*x[0]**2 + 96*x[0] + x[1] - 36]

def cons_Jacobian(x):
    return [[(-8)*x[0]**3 + 24*x[0]**2 - 16*x[0], 1],
            [(-16)*x[0]**3 + 96*x[0]**2 - 176*x[0] + 96, 1]]

def cons_Hessian(x, v):
    # TODO
    return v[0]*[[(-24)*x[0]**2 + 48*x[0] - 16, 0], [0, 0]] + v[1]*[[(-48)*x[0]**2 + 192*x[0] - 176, 0], [0, 0]]

nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 0, jac=cons_Jacobian, hess=cons_Hessian)

bounds = Bounds([0, 0], [3.0, 4.0])
x0 = np.array([0.5, 1])
res = minimize(f, x0, method='trust-constr', jac=grad, hess=hess,
               constraints=[nonlinear_constraint], bounds=bounds)

The cons_Hessian(x, v) is absolutely wrong in my code.

In their example, although the Hessians are simply [[2, 0], [0, 0]] and [[2, 0], [0, 0]], the usage is confusing. I don't understand where p comes in.

from scipy.sparse.linalg import LinearOperator
def cons_H_linear_operator(x, v):
    def matvec(p):
        return np.array([p[0]*2*(v[0]+v[1]), 0])
    return LinearOperator((2, 2), matvec=matvec)
nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 1,
                                          jac=cons_J, hess=cons_H_linear_operator)
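
If I understand correctly, p would be the vector the Hessian gets applied to, i.e. matvec(p) should return H(x, v) @ p with H(x, v) = v[0]*H1(x) + v[1]*H2(x). My attempt at adapting this to my Hessians is the sketch below, but I'm not sure it's right:

def cons_H_linear_operator(x, v):
    # nonzero second derivatives of the two constraints (both w.r.t. x[0]);
    # every other entry of both Hessians is zero
    h1 = (-24)*x[0]**2 + 48*x[0] - 16
    h2 = (-48)*x[0]**2 + 192*x[0] - 176
    def matvec(p):
        # (v[0]*H1 + v[1]*H2) @ p; only the (0, 0) entry is nonzero
        return np.array([(v[0]*h1 + v[1]*h2)*p[0], 0])
    return LinearOperator((2, 2), matvec=matvec)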

Upvotes: 0

Views: 215

Answers (1)

joni

Reputation: 7157

There's no need to use a LinearOperator. You only need to ensure that cons_f, cons_Jacobian and cons_Hessian return np.ndarrays instead of Python lists; that's the reason you can't evaluate your cons_Hessian, since multiplying a plain list by a scalar is not an elementwise operation. Additionally, it's highly recommended to use float literals instead of integers, i.e. -2.0 instead of -2, to prevent the functions from returning np.ndarrays with an integer dtype.

Your example works for me after rewriting these functions as follows:

def cons_f(x):
    # constraint values as a 1-D ndarray
    con1 = (-2.0)*x[0]**4 + 8*x[0]**3 + (-8)*x[0]**2 + x[1] - 2
    con2 = (-4.0)*x[0]**4 + 32*x[0]**3 + (-88)*x[0]**2 + 96*x[0] + x[1] - 36
    return np.array([con1, con2])

def cons_Jacobian(x):
    # Jacobian as a 2-D ndarray, one row per constraint
    con1_grad = [(-8.0)*x[0]**3 + 24*x[0]**2 - 16*x[0], 1]
    con2_grad = [(-16.0)*x[0]**3 + 96*x[0]**2 - 176*x[0] + 96, 1]
    return np.array([con1_grad, con2_grad])

def cons_Hessian(x, v):
    # linear combination of the constraint Hessians, weighted by v
    con1_hess = np.array([[(-24.0)*x[0]**2 + 48*x[0] - 16, 0], [0, 0]])
    con2_hess = np.array([[(-48.0)*x[0]**2 + 192*x[0] - 176, 0], [0, 0]])
    return v[0]*con1_hess + v[1]*con2_hess
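
With these definitions, the rest of the script from the question runs unchanged:

nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 0,
                                           jac=cons_Jacobian, hess=cons_Hessian)
res = minimize(f, x0, method='trust-constr', jac=grad, hess=hess,
               constraints=[nonlinear_constraint], bounds=bounds)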

Upvotes: 1
