Norhther
Norhther

Reputation: 500

Minimizing function using scipy with constraints

I have the following constraints for the problem, I want to minimize the sum of squared differences of w_i, uw_i divided by SUM(uw) following these restrictions:

1. w_i is at maximum, ul
2. w_i is, at least, 0.05
3. The sum of all w for a sector code can not be bigger than 0.50

So basically I want to generate all w_i for each row, however, I dont know how to implement the third restriction with scipy.

With scipy.optimize.lsq_linear I can force the first two conditions with bound = (0.05, ul), but I don't know how to force the third one.

import pandas as pd
import scipy
import numpy as np
df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)
df

Upvotes: 0

Views: 1642

Answers (2)

joni
joni

Reputation: 7157

You already got a working answer from @constantstranger. IMO, there's just one problem: it's quite slow. More precisely, it took more than a minute to solve the problem on my machine.

Therefore, some notes on what could be done in order to speed up the solver in the following:

  • Since Python has a noticeable overhead when calling functions, it's a good idea to implement all functions as fast as possible. For instance, evaluating one vectorial constraint function is faster than evaluating multiple scalar constraint functions.

  • At the moment, all derivatives (the objective gradient and the constraint Jacobian) are approximated by finite differences. This is a real bottleneck because each evaluation of the approximated derivative goes in hand with multiple objective/constraint function evaluations. Instead, it's highly recommended to provide the exact derivatives or use algorithmic differentiation.

  • Last but not least, scipy.optimize.minimize is only suited for small to mid-sized problems at least. If you are willing to use another package, you could use IPOPT, the state-of-the-art NLP solver. The cyipopt package provides a scipy-like interface, so it isn't hard switching from scipy.optimize.minimize.

Besides from that, your problem is a (convex) quadratic optimization problem and can be formulated as follows:

min f(w) s.t.  A*w <= 0.5, u_l <= w <= 0.05
with 
f(w) = (1/sum(u_w)) * ||w - u_w||^2_2 = (1/sum(u_w)) * (w'Iw - 2u_w'*w + u_w'u_w)

where A[i,j] = 1 if w[j] belongs to sector i and 0 otherwise.

Then, solving the problem with IPOPT (note that we pass the exact derivatives) looks like this:

import numpy as np
import pandas as pd
from cyipopt import minimize_ipopt

# dataframe
df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)

# sectors
sectors = df["Sector Code"].unique()

# building the matrix A
A = np.zeros((sectors.size, len(df)))
for i, sec in enumerate(sectors):
    indices = df[df["Sector Code"] == sec].index.values
    A[i, indices] = 1

uw = df['uw'].values
uw_sum = uw.sum()

# objective
def obj(w):
    return np.sum((w - uw)**2) / uw_sum

# objective gradient
def grad(w):
    return (2*w - 2*uw) / uw_sum

# Linear Constraint A @ w <= 0.5  <=>  0.5 - A @ w >= 0
cons = [{'type': 'ineq', 'fun': lambda w: 0.5 - A @ w, 'jac': lambda w: -A}]

# variable bounds
bounds = [(u_i, 0.05) for u_i in df.ul.values]

# feasible initial guess
w0 = np.ones(len(df)) / len(df)

# solve the problem
res = minimize_ipopt(obj, x0=w0, jac=grad, bounds=bounds, constraints=cons)

print(res)

On my machine, this terminates in less than 2 seconds and yields


******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

     fun: 0.4306218505716169
    info: {'x': array([0.05      , 0.05      , 0.03128946, ..., 0.04103131, 0.05      ,
       0.03348038]), 'g': array([-3.51687688, -9.45217602, -7.88799127, -1.78825803, -1.86650095,
       -5.09092925, -2.11181422, -1.35485327, -1.15847276,  0.35      ]), 'obj_val': 0.4306218505716169, 'mult_g': array([-1.000000e+03, -1.000000e+03, -1.000000e+03, -1.000000e+03,
       -1.000000e+03, -1.000000e+03, -1.000000e+03, -1.000000e+03,
       -1.000000e+03, -2.857166e-09]), 'mult_x_L': array([1000.02960821, 1000.02197802, 1000.00000005, ..., 1000.00000011,
       1000.02728049, 1000.00000006]), 'mult_x_U': array([0.00000000e+00, 0.00000000e+00, 5.34457820e-08, ...,
       1.11498931e-07, 0.00000000e+00, 6.05340266e-08]), 'status': 2, 'status_msg': b'Algorithm converged to a point of local infeasibility. Problem may be infeasible.'}
 message: b'Algorithm converged to a point of local infeasibility. Problem may be infeasible.'
    nfev: 13
     nit: 9
    njev: 7
  status: 2
 success: False
       x: array([0.05      , 0.05      , 0.03128946, ..., 0.04103131, 0.05      ,
       0.03348038])
[Finished in 1.9s]

Upvotes: 1

constantstranger
constantstranger

Reputation: 9389

I think you are trying to do something like this:

import pandas as pd
import scipy
import numpy as np

'''
Minimize the sum of squared differences of w_i, uw_i divided by SUM(uw) following these restrictions:

Constraints:
1. w_i is at maximum, ul [NOTE: I think this should say 'minimum']
2. w_i is, at least, 0.05 [NOTE: I think this should say 'at most']
3. The sum of all w for a sector code can not be bigger than 0.50
'''

df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)
print(df)

gb = df.groupby('Sector Code')['ul']
codeCounts = gb.count().to_list()
cumCounts = [0] + [sum(codeCounts[:i + 1]) for i in range(len(codeCounts))]
newIdx = []
for code, dfGp in gb:
    newIdx += list(dfGp.index)
df = df.reindex(newIdx)

# For each unique Sector Code, create constraint that 0.50 minus the sum of w for that code must be non-negative:
def foo(i, c):
    # return a closure that acts as a constraint for the i'th interval defined by c[i-1]:c[i]
    def bar(x):
        return 0.50 - sum(x[c[i-1]:c[i]])
    return bar
cons = [{'type': 'ineq', 'fun': foo(i, cumCounts)} for i in range(1, len(cumCounts))]

# Value of bounds argument to enforce ul <= w_i <= 0.05
bnds = tuple((ul_i, 0.05) for code, ul_group in gb for ul_i in ul_group)

# Initial guess
n = len(df.index)
w_i = np.ones(n) * (1 / n)

# The objective function to be minimized
uw_sum = df.uw.sum()
def fun(w):
    return (pd.Series(w) - df.uw).pow(2).sum() / uw_sum

# Optimize using scipy minimize() function
from scipy.optimize import minimize
res = minimize(fun, w_i, method='SLSQP', bounds=bnds, constraints=cons)
print(res)

df['w'] = res.x
df = df.reindex(range(len(df.index)))
print(df)

Explanation:

  • Use groupby() to get the row count for each unique Sector Code value and also to construct an index ordered by Sector Code, which we use to re-order the original input df
  • create a list of constraint dictionaries to be passed to the optimizer, one for each Sector Code, which will use python closures to constrain the sum of the corresponding solution elements to be <= 0.50
  • create a sequence of bounds to constrain solution elements w_i to be between ul and 0.05
  • create the objective function to return the sum of squared differences of w_i, uw_i divided by sum(uw)
  • call minimize() from scipy.optimize with the above constraints, bounds, objective function and an initial guess
  • add a column to the dataframe with the result and call reindex() to restore the original row order.

Output:

            uw        ul  Sector Code
0     0.006822  0.050000           40
1     0.017949  0.050000           40
2     0.001906  0.031289           40
3     0.000904  0.040318           20
4     0.001147  0.046904           15
...        ...       ...          ...
1226  0.003653  0.033553           10
1227  0.002556  0.031094           10
1228  0.002816  0.041031           10
1229  0.010216  0.050000           40
1230  0.001559  0.033480           55

[1231 rows x 3 columns]
     fun: 0.4487707682194904
     jac: array([0.02089997, 0.00466947, 0.01358654, ..., 0.02070332,        nan,
       0.02188896])
 message: 'Positive directional derivative for linesearch'
    nfev: 919
     nit: 5
    njev: 1
  status: 8
 success: False
       x: array([0.03730054, 0.0247585 , 0.02171931, ..., 0.03300862, 0.05      ,
       0.03348039])
            uw        ul  Sector Code         w
0     0.006822  0.050000           40  0.050000
1     0.017949  0.050000           40  0.050000
2     0.001906  0.031289           40  0.031289
3     0.000904  0.040318           20  0.040318
4     0.001147  0.046904           15  0.046904
...        ...       ...          ...       ...
1226  0.003653  0.033553           10  0.033553
1227  0.002556  0.031094           10  0.031094
1228  0.002816  0.041031           10  0.041031
1229  0.010216  0.050000           40  0.050000
1230  0.001559  0.033480           55  0.033480

[1231 rows x 4 columns]

Note that success is False, so perhaps some work remains. Hopefully the dataframe related manipulations are helpful in addressing your question.

Upvotes: 1

Related Questions