JB17
JB17

Reputation: 65

Optimization with Binary decision variable and simulated database

I have a data frame of simulated observations and I am trying to figure an optimization to get the greatest amount of successes for a given level of risk. The issue is I have no clue how to program in the constraints. Here is a subset of my DF, Rows are teams and X1:X10 are simulated results of successes and failures

    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1   0  1  0  1  0  0  0  1  1   0
2   0  1  0  0  1  1  1  1  1   1
3   1  0  0  0  0  0  0  1  1   0
4   1  0  1  1  1  0  0  1  0   1
5   0  0  1  0  1  0  1  1  1   0
6   0  1  0  1  0  1  1  0  1   1
7   1  0  0  0  0  1  1  1  1   0
8   0  0  1  1  0  0  1  1  0   1
9   1  0  0  0  0  1  1  1  1   0
10  0  1  1  0  0  1  0  1  0   0

The goal would be to maximize sum(colMeans(DF))

Based on the comment below I figured I would write out the math I am trying to solve. There are several different ways to do this it seems from research but I have no clue how to write the constraints. My larger dataset is 10,000 X 10,000.

One Approach

Objective Function = Max(average(sum(Simutaltion [i])

Constraints:

Observations are binary

Observations = 4

SD(portfolio)) < X , or Min(success) > 3

Other Approach:

Objective Function = Min(SD(sum(Simutaltion [i])) or Max((average(sum(Simutaltion [i])/(SD(portfolio)

Constraints:

Observations are binary

Observations = 4

Average(sum(Simulatin[i])) > 5

So I have been playing around with this some more, and I am starting to get somewhere.

library(quantprog)
library(Matrix)
dmat = cov(t(df))
dmat = dmat$mat
dvec = rowMeans(df)
n = nrow(dmat)
Amat = matrix(1,nrow = n)
bvec = 1
meq = 1
sol <- solve.QP(dmat, dvec = -dvec, Amat = Amat, meq = meq)
sol
$solution
 [1] -2.2700081 -1.0040420  5.9587712 -2.8265966  0.2037187  1.1641067 -2.0099030  1.5024252 -2.0099030
[10]  1.2914309

$value
[1] -0.730477

$unconstrained.solution
 [1]   6.750   6.750   6.750 -81.000 -38.250 -78.750 -12.375  36.000 -12.375 -29.250

$iterations
[1] 2 0

$Lagrangian
[1] 0.5626516

$iact
[1] 1

When I make this change to bvec = c(1, rep(0|-.25, n)) I get a solutions that is essentially all 0,

$solution
 [1]  6.018572e-17 -5.455221e-16  4.873879e-13 -4.176466e-15 -8.236146e-17 -2.663993e-16  9.764273e-17
 [8] -4.092027e-17 -2.630662e-13  0.000000e+00

Used tseries and got these results,

portfolio.optim(t(df), covmat = as.matrix(dmat) )
$pw
 [1] -6.146157e-17  8.336774e-20  1.818182e-01  2.272727e-01  1.212121e-01  3.030303e-01  0.000000e+00
 [8]  2.658696e-17  6.164972e-15  1.666667e-01

$px
 [1] 0.4090909 0.4696970 0.5151515 0.5303030 0.3484848 0.4696970 0.4242424 0.6969697 0.6060606 0.5303030

$pm
[1] 0.5

$ps
[1] 0.1005038

If I can get a weight constraint for tseries that would work too.

Any tips on editing the constraints also would like to add a min success rate as a constraint as well. Thanks,

Upvotes: 0

Views: 809

Answers (1)

Erwin Kalvelagen
Erwin Kalvelagen

Reputation: 16724

I would approach this in two steps. First write down the model in math, and then try to implement this model in a piece of software. Lets start with the simplest, a linear model. Using a standard deviation inside a constraint asks for heavy machinery (the model becomes an MINLP or at least quadratically constrained mixed integer problem).

I believe the mathematical model can be stated as:

enter image description here

Here n=4 and d is your data frame as a matrix. We can even simplify more by observing that the risk equation can be done during preprocessing: drop all teams from the dataframe if their rowsum is less than 2.

Now that we have the model we need to get it into a form accepted by a solver. I saw you used LpSolveAPI. As the model is very simple, it is not too difficult to cast this model in calls to the lpSolveAPI.

Upvotes: 2

Related Questions