Find the value of variables to maximize return of function in Python

Question

I'd want to achieve similar result as how the Solver-function in Excel is working. I've been reading of Scipy optimization and been trying to build a function which outputs what I would like to find the maximal value of. The equation is based on four different variables which, see my code below:

import pandas as pd
import numpy as np
from scipy import optimize

cols = {
    'Dividend2': [9390, 7448, 177], 
    'Probability': [341, 376, 452], 
    'EV': [0.53, 0.60, 0.55], 
    'Dividend': [185, 55, 755], 
    'EV2': [123, 139, 544],
}

df = pd.DataFrame(cols)

def myFunc(params):
    """myFunc metric."""
    (ev, bv, vc, dv) = params
    df['Number'] = np.where(df['Dividend2'] <= vc, 1, 0) \
                    + np.where(df['EV2'] <= dv, 1, 0)
    df['Return'] =  np.where(
        df['EV'] <= ev, 0, np.where(
            df['Probability'] >= bv, 0, df['Number'] * df['Dividend'] - (vc + dv)
        )
    )
    return -1 * (df['Return'].sum())

b1 = [(0.2,4), (300,600), (0,1000), (0,1000)]
start = [0.2, 600, 1000, 1000]
result = optimize.minimize(fun=myFunc, bounds=b1, x0=start)
print(result)

So I'd like to find the maximum value of the column Return in df when changing the variables ev,bv,vc & dv. I'd like them to be between in the intervals of ev: 0.2-4, bv: 300-600, vc: 0-1000 & dv: 0-1000.

When running my code it seem like the function stops at x0.

CypherX · Accepted Answer

Solution

I will use optuna library to give you a solution to the type of problem you are trying to solve. I have tried using scipy.optimize.minimize and it appears that the loss-landscape is probably quite flat in most places, and hence the tolerances enforce the minimizing algorithm (L-BFGS-B) to stop prematurely.

Optuna Docs: https://optuna.readthedocs.io/en/stable/index.html

With optuna, it rather straight forward. Optuna only requires an objective function and a study. The study send various trials to the objective function, which in turn, evaluates the metric of your choice.

I have defined another metric function myFunc2 by mostly removing the np.where calls, as you can do-away with them (reduces number of steps) and make the function slightly faster.

# install optuna with pip
pip install -Uqq optuna

Although I looked into using a rather smooth loss landscape, sometimes it is necessary to visualize the landscape itself. The answer in section B elaborates on visualization. But, what if you want to use a smoother metric function? Section D sheds some light on this.

Order of code-execution should be:

Sections: C >> B >> B.1 >> B.2 >> B.3 >> A.1 >> A.2 >> D

A. Building Intuition

If you create a hiplot (also known as a plot with parallel-coordinates) with all the possible parameter values as mentioned in the search_space for Section B.2, and plot the lowest 50 outputs of myFunc2, it would look like this:

Plotting all such points from the search_space would look like this:

A.1. Loss Landscape Views for Various Parameter-Pairs

These figures show that mostly the loss-landscape is flat for any two of the four parameters (ev, bv, vc, dv). This could be a reason why, only GridSampler (which brute-forces the searching process) does better, compared to the other two samplers (TPESampler and RandomSampler). Please click on any of the images below to view them enlarged. This could also be the reason why scipy.optimize.minimize(method="L-BFGS-B") fails right off the bat.

`01. dv-vc`	`02. dv-bv`	`03. dv-ev`
`04. bv-ev`	`05. cv-ev`	`06. vc-bv`

# Create contour plots for parameter-pairs
study_name = "GridSampler"
study = studies.get(study_name)

views = [("dv", "vc"), ("dv", "bv"), ("dv", "ev"), 
         ("bv", "ev"), ("vc", "ev"), ("vc", "bv")]

for i, (x, y) in enumerate(views):
    print(f"Figure: {i}/{len(views)}")
    study_contour_plot(study=study, params=(x, y))

A.2. Parameter Importance

study_name = "GridSampler"
study = studies.get(study_name)

fig = optuna.visualization.plot_param_importances(study)
fig.update_layout(title=f'Hyperparameter Importances: {study.study_name}', 
                  autosize=False,
                  width=800, height=500,
                  margin=dict(l=65, r=50, b=65, t=90))
fig.show()

B. Code

Section B.3. finds the lowest metric -88.333 for:

{'ev': 0.2, 'bv': 500.0, 'vc': 222.2222, 'dv': 0.0}

import warnings
from functools import partial
from typing import Iterable, Optional, Callable, List

import pandas as pd
import numpy as np
import optuna
from tqdm.notebook import tqdm

warnings.filterwarnings("ignore", category=optuna.exceptions.ExperimentalWarning)
optuna.logging.set_verbosity(optuna.logging.WARNING)

PARAM_NAMES: List[str] = ["ev", "bv", "vc", "dv",]
DEFAULT_METRIC_FUNC: Callable = myFunc2


def myFunc2(params):
    """myFunc metric v2 with lesser steps."""
    global df # define as a global variable
    (ev, bv, vc, dv) = params
    df['Number'] = (df['Dividend2'] <= vc) * 1 + (df['EV2'] <= dv) * 1
    df['Return'] =  (
        (df['EV'] > ev) 
        * (df['Probability'] < bv) 
        * (df['Number'] * df['Dividend'] - (vc + dv))
    )
    return -1 * (df['Return'].sum())


def make_param_grid(
        bounds: List[Tuple[float, float]], 
        param_names: Optional[List[str]]=None, 
        num_points: int=10, 
        as_dict: bool=True,
    ) -> Union[pd.DataFrame, Dict[str, List[float]]]:
    """
    Create parameter search space.

    Example:
    
        grid = make_param_grid(bounds=b1, num_points=10, as_dict=True)
    
    """
    if param_names is None:
        param_names = PARAM_NAMES # ["ev", "bv", "vc", "dv"]
    bounds = np.array(bounds)
    grid = np.linspace(start=bounds[:,0], 
                       stop=bounds[:,1], 
                       num=num_points, 
                       endpoint=True, 
                       axis=0)
    grid = pd.DataFrame(grid, columns=param_names)
    if as_dict:
        grid = grid.to_dict()
        for k,v in grid.items():
            grid.update({k: list(v.values())})
    return grid


def objective(trial, 
              bounds: Optional[Iterable]=None, 
              func: Optional[Callable]=None, 
              param_names: Optional[List[str]]=None):
    """Objective function, necessary for optimizing with optuna."""
    if param_names is None:
        param_names = PARAM_NAMES
    if (bounds is None):
        bounds = ((-10, 10) for _ in param_names)
    if not isinstance(bounds, dict):
        bounds = dict((p, (min(b), max(b))) 
                        for p, b in zip(param_names, bounds))
    if func is None:
        func = DEFAULT_METRIC_FUNC

    params = dict(
        (p, trial.suggest_float(p, bounds.get(p)[0], bounds.get(p)[1])) 
        for p in param_names        
    )
    # x = trial.suggest_float('x', -10, 10)
    return func((params[p] for p in param_names))


def optimize(objective: Callable, 
             sampler: Optional[optuna.samplers.BaseSampler]=None, 
             func: Optional[Callable]=None, 
             n_trials: int=2, 
             study_direction: str="minimize",
             study_name: Optional[str]=None,
             formatstr: str=".4f",
             verbose: bool=True):
    """Optimizing function using optuna: creates a study."""
    if func is None:
        func = DEFAULT_METRIC_FUNC
    study = optuna.create_study(
        direction=study_direction, 
        sampler=sampler, 
        study_name=study_name)
    study.optimize(
        objective, 
        n_trials=n_trials, 
        show_progress_bar=True, 
        n_jobs=1,
    )
    if verbose:
        metric = eval_metric(study.best_params, func=myFunc2)
        msg = format_result(study.best_params, metric, 
                            header=study.study_name, 
                            format=formatstr)
        print(msg)
    return study


def format_dict(d: Dict[str, float], format: str=".4f") -> Dict[str, float]:
    """
    Returns formatted output for a dictionary with 
    string keys and float values.
    """
    return dict((k, float(f'{v:{format}}')) for k,v in d.items())


def format_result(d: Dict[str, float], 
                  metric_value: float, 
                  header: str='', 
                  format: str=".4f"): 
    """Returns formatted result."""
    msg = f"""Study Name: {header}
{'='*30}
    
    ✅ study.best_params: 
	{format_dict(d)}
    ✅ metric: {metric_value} 
    """
    return msg


def study_contour_plot(study: optuna.Study, 
                       params: Optional[List[str]]=None, 
                       width: int=560, 
                       height: int=500):
    """
    Create contour plots for a study, given a list or 
    tuple of two parameter names.
    """
    if params is None:
        params = ["dv", "vc"]
    fig = optuna.visualization.plot_contour(study, params=params)
    fig.update_layout(
        title=f'Contour Plot: {study.study_name} ({params[0]}, {params[1]})', 
        autosize=False,
        width=width, 
        height=height,
        margin=dict(l=65, r=50, b=65, t=90))
    fig.show()


bounds = [(0.2, 4), (300, 600), (0, 1000), (0, 1000)]
param_names = PARAM_NAMES # ["ev", "bv", "vc", "dv",]
pobjective = partial(objective, bounds=bounds)

# Create an empty dict to contain 
# various subsequent studies.
studies = dict()

Optuna comes with a few different types of Samplers. Samplers provide the strategy of how optuna is going to sample points from the parametr-space and evaluate the objective function.

https://optuna.readthedocs.io/en/stable/reference/samplers.html

B.1 Use `TPESampler`

from optuna.samplers import TPESampler

sampler = TPESampler(seed=42)

study_name = "TPESampler"
studies[study_name] = optimize(
    pobjective, 
    sampler=sampler, 
    n_trials=100, 
    study_name=study_name,
)

# Study Name: TPESampler
# ==============================
#    
#     ✅ study.best_params: 
#   {'ev': 1.6233, 'bv': 585.2143, 'vc': 731.9939, 'dv': 598.6585}
#     ✅ metric: -0.0

B.2. Use `GridSampler`

GridSampler requires a parameter search grid. Here we are using the following search_space.

from optuna.samplers import GridSampler

# create search-space
search_space = make_param_grid(bounds=bounds, num_points=10, as_dict=True)

sampler = GridSampler(search_space)

study_name = "GridSampler"
studies[study_name] = optimize(
    pobjective, 
    sampler=sampler, 
    n_trials=2000, 
    study_name=study_name,
)

# Study Name: GridSampler
# ==============================
#    
#     ✅ study.best_params: 
#   {'ev': 0.2, 'bv': 500.0, 'vc': 222.2222, 'dv': 0.0}
#     ✅ metric: -88.33333333333337

B.3. Use `RandomSampler`

from optuna.samplers import RandomSampler

sampler = RandomSampler(seed=42)

study_name = "RandomSampler"
studies[study_name] = optimize(
    pobjective, 
    sampler=sampler, 
    n_trials=300, 
    study_name=study_name,
)

# Study Name: RandomSampler
# ==============================
#    
#     ✅ study.best_params: 
#   {'ev': 1.6233, 'bv': 585.2143, 'vc': 731.9939, 'dv': 598.6585}
#     ✅ metric: -0.0

C. Dummy Data

For the sake of reproducibility, I am keeping a record of the dummy data used here.

import pandas as pd
import numpy as np
from scipy import optimize

cols = {
    'Dividend2': [9390, 7448, 177], 
    'Probability': [341, 376, 452], 
    'EV': [0.53, 0.60, 0.55], 
    'Dividend': [185, 55, 755], 
    'EV2': [123, 139, 544],
}

df = pd.DataFrame(cols)

def myFunc(params):
    """myFunc metric."""
    (ev, bv, vc, dv) = params
    df['Number'] = np.where(df['Dividend2'] <= vc, 1, 0) \
                    + np.where(df['EV2'] <= dv, 1, 0)
    df['Return'] =  np.where(
        df['EV'] <= ev, 0, np.where(
            df['Probability'] >= bv, 0, df['Number'] * df['Dividend'] - (vc + dv)
        )
    )
    return -1 * (df['Return'].sum())

b1 = [(0.2,4), (300,600), (0,1000), (0,1000)]
start = [0.2, 600, 1000, 1000]
result = optimize.minimize(fun=myFunc, bounds=b1, x0=start)
print(result)

C.1. An Observation

So, it seems at first glance that the code executed properly and did not throw any error. It says it had success in finding the minimized solution.

      fun: -0.0
 hess_inv: <4x4 LbfgsInvHessProduct with dtype=float64>
      jac: array([0., 0., 3., 3.])
  message: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL' # 💡
     nfev: 35
      nit: 2
   status: 0
  success: True
        x: array([2.e-01, 6.e+02, 0.e+00, 0.e+00]) # 🔥

A close observation reveals that the solution (see 🔥) is no different from the starting point [0.2, 600, 1000, 1000]. So, seems like nothing really happened and the algorithm just finished prematurely?!!

Now look at the message above (see 💡). If we run a google search on this, you could find something like this:

Summary

b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'

If the loss-landscape does not have a smoothely changing topography, the gradient descent algorithms will soon find that from one iteration to the next, there isn't much change happening and hence, will terminate further seeking. Also, if the loss-landscape is rather flat, this could see similar fate and get early-termination.
- scipy-optimize-minimize does not perform the optimization - CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL

D. Making the Loss Landscape Smoother

A binary evaluation of value = 1 if x>5 else 0 is essentially a step-function that assigns 1 for all values of x that are greater than 5 and 0 otherwise. But this introduces a kink - a discontinuity in smoothness and this could potentially introduce problems in traversing the loss-landscape.

What if we use a sigmoid function to introduce some smoothness?

$https://latex.codecogs.com/svg.image?\large {\color{Pink} \sigma(x) = \frac{1}{1 + e^{-x}}}$

# Define sigmoid function
def sigmoid(x):
    """Sigmoid function."""
    return 1 / (1 + np.exp(-x))

For the above example, we could modify it as follows.

$https://latex.codecogs.com/svg.image?\large {\color{Pink} \sigma((x - 5)) = \frac{1}{1 + e^{-(x - 5)}}}$

You can additionally introduce another factor (gamma: γ) as follows and try to optimize it to make the landscape smoother. Thus by controlling the gamma factor, you could make the function smoother and change how quickly it changes around x = 5

$https://latex.codecogs.com/svg.image?\large {\color{Pink} \sigma((x - 5)/\gamma) = \frac{1}{1 + e^{-(x - 5)/\gamma}}}$

The above figure is created with the following code-snippet.

import matplotlib.pyplot as plt

%matplotlib inline 
%config InlineBackend.figure_format = 'svg' # 'svg', 'retina' 
plt.style.use('seaborn-white')

def make_figure(figtitle: str="Sigmoid Function"):
    """Make the demo figure for using sigmoid."""

    x = np.arange(-20, 20.01, 0.01)
    y1 = sigmoid(x)
    y2 = sigmoid(x - 5)
    y3 = sigmoid((x - 5)/3)
    y4 = sigmoid((x - 5)/0.3)
    fig, ax = plt.subplots(figsize=(10,5))
    plt.sca(ax)
    plt.plot(x, y1, ls="-", label="$\sigma(x)$")
    plt.plot(x, y2, ls="--", label="$\sigma(x - 5)$")
    plt.plot(x, y3, ls="-.", label="$\sigma((x - 5) / 3)$")
    plt.plot(x, y4, ls=":", label="$\sigma((x - 5) / 0.3)$")
    plt.axvline(x=0, ls="-", lw=1.3, color="cyan", alpha=0.9)
    plt.axvline(x=5, ls="-", lw=1.3, color="magenta", alpha=0.9)
    plt.legend()
    plt.title(figtitle)
    plt.show()

make_figure()

D.1. Example of Metric Smoothing

The following is an example of how you could apply function smoothing.

from functools import partial

def sig(x, gamma: float=1.):
    return sigmoid(x/gamma)

def myFunc3(params, gamma: float=0.5):
    """myFunc metric v3 with smoother metric."""
    (ev, bv, vc, dv) = params
    _sig = partial(sig, gamma=gamma)
    df['Number'] = _sig(x = -(df['Dividend2'] - vc)) * 1 \
                    + _sig(x = -(df['EV2'] - dv)) * 1
    df['Return'] = (
        _sig(x = df['EV'] - ev) 
        * _sig(x = -(df['Probability'] - bv))
        * _sig(x = df['Number'] * df['Dividend'] - (vc + dv))
    )
    return -1 * (df['Return'].sum())

Find the value of variables to maximize return of function in Python

Answers (2)

Solution

A. Building Intuition

A.1. Loss Landscape Views for Various Parameter-Pairs

A.2. Parameter Importance

B. Code

B.1 Use `TPESampler`

B.2. Use `GridSampler`

B.3. Use `RandomSampler`

C. Dummy Data

C.1. An Observation

D. Making the Loss Landscape Smoother

D.1. Example of Metric Smoothing

Related Questions

Find the value of variables to maximize return of function in Python

Answers (2)

Solution

A. Building Intuition

A.1. Loss Landscape Views for Various Parameter-Pairs

A.2. Parameter Importance

B. Code

B.1 Use TPESampler

B.2. Use GridSampler

B.3. Use RandomSampler

C. Dummy Data

C.1. An Observation

D. Making the Loss Landscape Smoother

D.1. Example of Metric Smoothing

Related Questions

B.1 Use `TPESampler`

B.2. Use `GridSampler`

B.3. Use `RandomSampler`