Reputation: 65
I have a data frame of simulated observations and I am trying to figure an optimization to get the greatest amount of successes for a given level of risk. The issue is I have no clue how to program in the constraints. Here is a subset of my DF, Rows are teams and X1:X10 are simulated results of successes and failures
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 0 1 0 1 0 0 0 1 1 0
2 0 1 0 0 1 1 1 1 1 1
3 1 0 0 0 0 0 0 1 1 0
4 1 0 1 1 1 0 0 1 0 1
5 0 0 1 0 1 0 1 1 1 0
6 0 1 0 1 0 1 1 0 1 1
7 1 0 0 0 0 1 1 1 1 0
8 0 0 1 1 0 0 1 1 0 1
9 1 0 0 0 0 1 1 1 1 0
10 0 1 1 0 0 1 0 1 0 0
The goal would be to maximize sum(colMeans(DF))
Based on the comment below I figured I would write out the math I am trying to solve. There are several different ways to do this it seems from research but I have no clue how to write the constraints. My larger dataset is 10,000 X 10,000.
One Approach
Objective Function = Max(average(sum(Simutaltion [i])
Constraints:
Observations are binary
Observations = 4
SD(portfolio)) < X , or Min(success) > 3
Other Approach:
Objective Function = Min(SD(sum(Simutaltion [i])) or Max((average(sum(Simutaltion [i])/(SD(portfolio)
Constraints:
Observations are binary
Observations = 4
Average(sum(Simulatin[i])) > 5
So I have been playing around with this some more, and I am starting to get somewhere.
library(quantprog)
library(Matrix)
dmat = cov(t(df))
dmat = dmat$mat
dvec = rowMeans(df)
n = nrow(dmat)
Amat = matrix(1,nrow = n)
bvec = 1
meq = 1
sol <- solve.QP(dmat, dvec = -dvec, Amat = Amat, meq = meq)
sol
$solution
[1] -2.2700081 -1.0040420 5.9587712 -2.8265966 0.2037187 1.1641067 -2.0099030 1.5024252 -2.0099030
[10] 1.2914309
$value
[1] -0.730477
$unconstrained.solution
[1] 6.750 6.750 6.750 -81.000 -38.250 -78.750 -12.375 36.000 -12.375 -29.250
$iterations
[1] 2 0
$Lagrangian
[1] 0.5626516
$iact
[1] 1
When I make this change to bvec = c(1, rep(0|-.25, n))
I get a solutions that is essentially all 0,
$solution
[1] 6.018572e-17 -5.455221e-16 4.873879e-13 -4.176466e-15 -8.236146e-17 -2.663993e-16 9.764273e-17
[8] -4.092027e-17 -2.630662e-13 0.000000e+00
Used tseries and got these results,
portfolio.optim(t(df), covmat = as.matrix(dmat) )
$pw
[1] -6.146157e-17 8.336774e-20 1.818182e-01 2.272727e-01 1.212121e-01 3.030303e-01 0.000000e+00
[8] 2.658696e-17 6.164972e-15 1.666667e-01
$px
[1] 0.4090909 0.4696970 0.5151515 0.5303030 0.3484848 0.4696970 0.4242424 0.6969697 0.6060606 0.5303030
$pm
[1] 0.5
$ps
[1] 0.1005038
If I can get a weight constraint for tseries that would work too.
Any tips on editing the constraints also would like to add a min success rate as a constraint as well. Thanks,
Upvotes: 0
Views: 809
Reputation: 16724
I would approach this in two steps. First write down the model in math, and then try to implement this model in a piece of software. Lets start with the simplest, a linear model. Using a standard deviation inside a constraint asks for heavy machinery (the model becomes an MINLP or at least quadratically constrained mixed integer problem).
I believe the mathematical model can be stated as:
Here n=4 and d is your data frame as a matrix. We can even simplify more by observing that the risk equation can be done during preprocessing: drop all teams from the dataframe if their rowsum is less than 2.
Now that we have the model we need to get it into a form accepted by a solver. I saw you used LpSolveAPI. As the model is very simple, it is not too difficult to cast this model in calls to the lpSolveAPI.
Upvotes: 2