How to solve a simple optimization in R

Question

I am trying to use the optim() function in R to solve a simple problem, but I am facing some problems on how to implement it:

 e=tot_obs/(sum(Var1)+sum(Var2)+sum(Var3)+sum(Var4))
 output=(Var1+Var2+Var3+Var4)*e

I know the total of the observations and all the variables.

# Fake datasets   
# Considering that this are the observations c(1000,250,78,0,0,90)

#Known data
total_observations=1418
var1=c(1,0.3,0.5,0.01,0.05,0.6)
var2=c(500,40,40,0,0,100)
var3=c(1,0.1,0.2,0,0.1,0)
var4=c(2,0.04,0.003,0.003,0,0.05)

#Function
e=total_observations/(sum(var1)+sum(var2)+sum(var3)+sum(var4))
output=(var1+var2+var3+var4)*e

I can do a simple correlation between observations and output, with good results (~0.90). This one gives me 0.97.

But now I want to test the effect of having different weights assign to each variable.

e=tot_obs/(sum(w1*Var1)+sum(w2*Var2)+sum(w3*Var3)+sum(w4*Var4))
output=(w1*Var1+w2*Var2+w3*Var3+w4*Var4)*e
where w1+w2+w3+w4=1
and cor(observations,output)~1

I was trying to use optim() function, however I am completely lost. If anyone could help me out or point me some good references on how to do this, I would appreciate.

plannapus · Accepted Answer

You need to use function solnp in package Rsolnpbecause it allows constraints based on an equality.

The idea is to build a function to minimize and your equality function for the constraint.

Fun <- function(param){
    e <- total_observations/(sum(param[1]*var1)+sum(param[2]*var2)+sum(param[3]*var3)+sum(param[4]*var4))
    output <- (param[1]*var1 + param[2]*var2 + param[3]*var3 + param[4]*var4)/e
    -cor(output, observations) #We want to maximize cor and therefore minimize -cor
    }

eqn <- function(param){sum(param)}

With your example data:

observations <- c(1000,250,78,0,0,90)
total_observations=1418
var1=c(1,0.3,0.5,0.01,0.05,0.6)
var2=c(500,40,40,0,0,100)
var3=c(1,0.1,0.2,0,0.1,0)
var4=c(2,0.04,0.003,0.003,0,0.05)

Your optimization:

solnp(c(.1,.2,.3,.4),fun=Fun, eqfun=eqn, eqB=1)

Iter: 1 fn: -0.9793  Pars:  0.1395748 0.0008403 0.3881053 0.4714796
Iter: 2 fn: -0.9793  Pars:  0.1395531 0.0008406 0.3881409 0.4714653
solnp--> Completed in 2 iterations
$pars
[1] 0.1395530843 0.0008406453 0.3881409239 0.4714653466

$convergence
[1] 0

$values
[1] -0.9729894 -0.9793458 -0.9793458

$lagrange
             [,1]
[1,] 2.521018e-06

$hessian
           [,1]        [,2]        [,3]        [,4]
[1,]  0.4843670   5.0498894 -0.08329380  0.39560040
[2,]  5.0498894 699.5317385 -2.38763807 -0.65610831
[3,] -0.0832938  -2.3876381  0.91837245 -0.09486495
[4,]  0.3956004  -0.6561083 -0.09486495  0.43979850

$ineqx0
NULL

$nfuneval
[1] 709

$outer.iter
[1] 2

$elapsed
Time difference of 0.2371149 secs

If you save that into a variable res, what you're looking for is stored in res$pars.

How to solve a simple optimization in R

Answers (1)

Related Questions