Reputation: 1971
I am trying to use the optim() function in R to solve a simple problem, but I am not sure how to implement it:
e=tot_obs/(sum(Var1)+sum(Var2)+sum(Var3)+sum(Var4))
output=(Var1+Var2+Var3+Var4)*e
I know the total of the observations and all the variables.
# Fake datasets
# Assuming these are the observations: c(1000,250,78,0,0,90)
#Known data
total_observations=1418
var1=c(1,0.3,0.5,0.01,0.05,0.6)
var2=c(500,40,40,0,0,100)
var3=c(1,0.1,0.2,0,0.1,0)
var4=c(2,0.04,0.003,0.003,0,0.05)
#Function
e=total_observations/(sum(var1)+sum(var2)+sum(var3)+sum(var4))
output=(var1+var2+var3+var4)*e
I can compute a simple correlation between the observations and the output, with good results (~0.90). This example gives 0.97.
But now I want to test the effect of assigning a different weight to each variable.
e=tot_obs/(sum(w1*Var1)+sum(w2*Var2)+sum(w3*Var3)+sum(w4*Var4))
output=(w1*Var1+w2*Var2+w3*Var3+w4*Var4)*e
where w1+w2+w3+w4=1
and cor(observations,output)~1
I was trying to use the optim() function, but I am completely lost. If anyone could help me out or point me to some good references on how to do this, I would appreciate it.
Upvotes: 4
Views: 2090
Reputation: 18759
You need to use the function solnp in package Rsolnp because it allows constraints based on an equality.
The idea is to build a function to minimize, plus an equality function for the constraint.
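Fun <- function(param){
  # Scale factor from the question; cor() is unaffected by it, but it keeps output on the scale of the observations
  e <- total_observations/(sum(param[1]*var1)+sum(param[2]*var2)+sum(param[3]*var3)+sum(param[4]*var4))
  output <- (param[1]*var1 + param[2]*var2 + param[3]*var3 + param[4]*var4)*e
  -cor(output, observations) # We want to maximize cor and therefore minimize -cor
}
# Equality constraint: solnp forces this value to equal eqB (1 in the call below)
eqn <- function(param){sum(param)}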
With your example data:
observations <- c(1000,250,78,0,0,90)
total_observations=1418
var1=c(1,0.3,0.5,0.01,0.05,0.6)
var2=c(500,40,40,0,0,100)
var3=c(1,0.1,0.2,0,0.1,0)
var4=c(2,0.04,0.003,0.003,0,0.05)
Your optimization:
library(Rsolnp)
solnp(c(.1,.2,.3,.4), fun=Fun, eqfun=eqn, eqB=1)
Iter: 1 fn: -0.9793 Pars: 0.1395748 0.0008403 0.3881053 0.4714796
Iter: 2 fn: -0.9793 Pars: 0.1395531 0.0008406 0.3881409 0.4714653
solnp--> Completed in 2 iterations
$pars
[1] 0.1395530843 0.0008406453 0.3881409239 0.4714653466
$convergence
[1] 0
$values
[1] -0.9729894 -0.9793458 -0.9793458
$lagrange
             [,1]
[1,] 2.521018e-06
$hessian
           [,1]        [,2]        [,3]        [,4]
[1,]  0.4843670   5.0498894 -0.08329380  0.39560040
[2,]  5.0498894 699.5317385 -2.38763807 -0.65610831
[3,] -0.0832938  -2.3876381  0.91837245 -0.09486495
[4,]  0.3956004  -0.6561083 -0.09486495  0.43979850
$ineqx0
NULL
$nfuneval
[1] 709
$outer.iter
[1] 2
$elapsed
Time difference of 0.2371149 secs
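As a sanity check, the weights printed under $pars can be plugged back into the weighted formula from the question. The sketch below hard-codes those reported values as printed, rather than calling solnp again, so it runs in base R. The names w and weighted_sum are chosen here for illustration only.

```r
# Example data from the question
observations <- c(1000, 250, 78, 0, 0, 90)
total_observations <- 1418
var1 <- c(1, 0.3, 0.5, 0.01, 0.05, 0.6)
var2 <- c(500, 40, 40, 0, 0, 100)
var3 <- c(1, 0.1, 0.2, 0, 0.1, 0)
var4 <- c(2, 0.04, 0.003, 0.003, 0, 0.05)

# Weights copied from the $pars output above (not recomputed here)
w <- c(0.1395530843, 0.0008406453, 0.3881409239, 0.4714653466)

# Weighted version of the question's formula
weighted_sum <- w[1] * var1 + w[2] * var2 + w[3] * var3 + w[4] * var4
e <- total_observations / sum(weighted_sum)
output <- weighted_sum * e

sum(w)                     # should be 1, up to rounding of the printed weights
sum(output)                # equals total_observations by construction
cor(observations, output)  # should match the reported fn value with the sign flipped
```

Note that cor() does not change when output is multiplied by a positive constant, so the factor e only rescales output to the total; it has no effect on which weights are optimal.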
If you save that into a variable res, what you're looking for is stored in res$pars.
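Since the question asked about optim() specifically, here is one possible base-R workaround. It is a different technique from the solnp approach above, not the same method: instead of imposing an equality constraint, the weights are written as a softmax of unconstrained parameters theta, so they are positive and sum to 1 by construction. The helper names to_weights and neg_cor and the starting point are assumptions made for illustration. Unlike the solnp call above, this version also forces every weight to be positive.

```r
# Example data from the question
observations <- c(1000, 250, 78, 0, 0, 90)
total_observations <- 1418
var1 <- c(1, 0.3, 0.5, 0.01, 0.05, 0.6)
var2 <- c(500, 40, 40, 0, 0, 100)
var3 <- c(1, 0.1, 0.2, 0, 0.1, 0)
var4 <- c(2, 0.04, 0.003, 0.003, 0, 0.05)

# Map unconstrained parameters to positive weights that sum to 1 (softmax)
to_weights <- function(theta) {
  z <- exp(theta - max(theta))  # subtract the max for numerical stability
  z / sum(z)
}

# Objective: negative correlation, so minimizing it maximizes cor
neg_cor <- function(theta) {
  w <- to_weights(theta)
  weighted_sum <- w[1] * var1 + w[2] * var2 + w[3] * var3 + w[4] * var4
  output <- weighted_sum * total_observations / sum(weighted_sum)
  -cor(observations, output)
}

# Start from equal weights (theta = 0); optim() defaults to Nelder-Mead
fit <- optim(rep(0, 4), neg_cor)

w_hat <- to_weights(fit$par)
w_hat        # estimated weights
-fit$value   # correlation achieved at those weights
```

This reparameterization works because the correlation is unchanged when all weights are multiplied by the same positive constant, so the sum-to-1 condition acts only as a normalization. The weights found this way may differ from the solnp result depending on the starting point and the optimization method.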
Upvotes: 5