Optimisation issue in R

Question

Question relating to optim function in R

I have the following code so far and need to know to input my values of X and T. X is a vector of 10 values and T is vector of 10*2 values relating to the means and variances. I want the output to be in the format of one new value for alpha, mean1, mean2, var1, and var2. Not sure how to get the input data in properly.

I want to run all values of X in this function but only the first row of T (10 values) and im not sure how to do this for T. I have a different function for the 2nd row.

R <-runif(10, 0, 1)
S <-1-R
T <-t(cbind(R,S))


X <- runif(10, 25, 35)


Data1 <- function(xy) { 
  alpha <- xy[1]
  mean1 <- xy[2]
  mean2 <- xy[3]
  var1 <- xy[4]
  var2 <- xy[5] 

  -sum(0.5*(((X)-mean1)/var1)^2+alpha*mean1+log(2.5*var1)+log(exp(-alpha*mean1)+exp(-alpha*mean2))*(T))
}
starting_values <- c(0.3, 28, 38, 4, 3)
optim(starting_values, Data1, lower=c(0, 0, 0, 0, 0), method='L-BFGS-B')

also getting the following error code:

Error in optim(starting_values, Data1, lower = c(0, 0, 0, 0, 0), method = "L-BFGS-B") : 
  L-BFGS-B needs finite values of 'fn'

Cheers for any help.

EDIT

second function for inclusion

0.5*((y1-mean2)/var2)^2+alpha*mean2+log(2.5*var2)+ log(exp(-alpha*mean1)+exp(-alpha*mean2)))*T

Ok so to explain as clearly as possible what i want to do. The first function in the original post above takes all 10 values of X one at a time and should take the first row of T data (labelled R here) and im not sure how to do this.

The second function detailed above should again take all 10 values of X in succession and then the second row of data from T (labelled as S below)

all this is then summed together. The five unknown parameters are thusly estimated.

T

       [,1]      [,2]      [,3]       [,4]      [,5]        [,6]      [,7]      [,8]      [,9]     [,10]
R 0.1477715 0.3055021 0.2963543 0.04149945 0.8342484 0.996865333 0.1592568 0.4623762 0.8000778 0.6979342
S 0.8522285 0.6944979 0.7036457 0.95850055 0.1657516 0.003134667 0.8407432 0.5376238 0.1999222 0.3020658

Edit2

im not getting the same values as ben, even with running the same seed. I have checked that i have all the packages installed and it would appear i do. Im not getting the same final answers and im also unable to call an individual item of opt2$par. Instead of providing reams of output, i will provide the first few lines and the last few.

0.3 28 38 4 3 -74.97014 -120.7212 
Loading required package: BB
Loading required package: quadprog
Loading required package: ucminf
Loading required package: Rcgmin
Loading required package: Rvmmin

Attaching package: ‘Rvmmin’

The following object(s) are masked from ‘package:optimx’:

    optansout

Loading required package: minqa
Loading required package: Rcpp
0.3 28 38 4 3 -74.97014 -120.7212 
0.9501 28 38 4 3 -176.3368 -265.9074 
1.9001 28 38 4 3 -324.7782 -478.4652 
0.9501 28.95 38 4 3 -179.9994 -260.8711 
0.9501 28 38.95 4 3 -176.3366 -283.0445 
0.9501 28 38 4.95 3 -176.7836 -265.9074 
0.9501 28 38 4 3.95 -176.3368 -254.6188

.................

16.32409 27.86113 38.54337 3.940143 2.504167 -2566.194 -3826.233 
16.32409 27.86113 38.54337 3.940044 2.504167 -2566.194 -3826.233 
16.32409 27.86113 38.54337 3.940093 2.504199 -2566.194 -3826.232 
16.32409 27.86113 38.54337 3.940093 2.504136 -2566.194 -3826.234 
> opt2$par
$par
[1] 16.324085 27.861134 38.543373  3.940093  2.504167

> opt2$par["mean1"]
$
NULL

Ben Bolker · Accepted Answer

A first crack: I used your code above. I added set.seed(101) at the beginning for reproducibility.

Reformatted the function slightly for clarity, but without changing anything significant, and added a cat() statement for debugging purposes:

Data1 <- function(xy) {
    alpha <- xy[1]; mean1 <- xy[2]; mean2 <- xy[3]
    var1 <- xy[4]; var2 <- xy[5]

    r1 <- -sum(0.5*((X-mean1)/var1)^2+
           alpha*mean1+
           log(2.5*var1)+
           log(exp(-alpha*mean1)+
               exp(-alpha*mean2))*T[1,])
    r2 <- -sum(0.5*((X-mean2)/var2)^2+
           alpha*mean2+
           log(2.5*var2)+
           log(exp(-alpha*mean1)+exp(-alpha*mean2))*T[2,])

    cat(xy,r1,r2,"
")
   r1+r2
}

A slightly compressed version, that (1) takes advantage of with to make the function cleaner; (2) uses R's replication and vector-recycling capabilities

Data2 <- function(xy) {
    with(as.list(xy),
    {
       mmat <- rep(c(mean1,mean2),each=ncol(T))
       vmat <- rep(c(var1,var2),each=ncol(T))
       -sum(0.5*((X-mmat)/vmat)^2+
          alpha*mmat+
          log(2.5*vmat)+
          log(exp(-alpha*mean1)+exp(-alpha*mean2))*T)
    })
}

Now we need a named vector of starting values:

 starting_values <- c(alpha=0.3, mean1=28, mean2=38, var1=4, var2=3)

Check that the results match:

 Data1(starting_values) ##  [1] -195.6913
 Data2(starting_values) ##  [1] -195.6913

This fails (but gives us information on how it fails):

 optim(par=starting_values, Data1, lower=rep(1e-4,5), method='L-BFGS-B',
     control=list(trace=6))

It produces a lot of output, ending with:

##  21.29998 27.97361 37.98915 4.011199 3.001 -6014.225 
## 21.29998 27.97361 37.98915 4.011199 2.999 -6014.225 
## 85.29991 27.89318 37.95606 4.04533 3 Inf 
## Error in optim(par = starting_values, Data1, lower = rep(1e-04, 5), 
##    method = "L-BFGS-B",  : 
##     L-BFGS-B needs finite values of 'fn'

This at least tells you where things went wrong. I would now try evaluating your expression piece-by-piece to see which bit overflowed.

As a commenter (Justin) in the chat room said,

your third term log(exp(...) + exp(...)) goes to -Inf very quickly since alpha, mean1 and mean2 are unbounded. exp(-large number * large number) ~ 0

For further debugging, you can:

try to rearrange the evaluation of your function to avoid overflows
set upper bounds on some of the parameters to avoid overflows
have the function test and return very large values rather than Inf in appropriate cases

Unfortunately, L-BFGS-B is more fragile than some of the other optimizers, and doesn't allow non-finite values.

Next I tried the bobyqa optimizer from the optimx package, which allows bounds and handles non-finite values (and is a derivative-free method, which in general tend to be slightly slower but more robust than the derivative-based methods): it seems to work OK, although I don't know if the answers are sensible or not.

library(optimx)
opt2 <- optimx(par=starting_values, 
      Data1, lower=rep(1e-4,5), method='bobyqa')
opt3 <- optimx(par=starting_values, 
      Data2, lower=rep(1e-4,5), method='bobyqa')

Looks OK (provided this is a sensible answer, which I don't know).

> opt2$par
$par
    alpha     mean1     mean2      var1      var2 
16.330752 27.815324 38.497483  3.894179  2.447219 

> opt3$par
$par
    alpha     mean1     mean2      var1      var2 
16.330900 27.820813 38.491290  3.887975  2.456052

Note that the answers are slightly different (by about 0.5% in the case of var2), which suggests that the fit may be slightly unstable/the surface may be quite flat. (Data1 and Data2 are supposed to give identical answers, and do so for the starting values, but I guess the order of operations makes them give very slightly different answers for some inputs -- or I screwed up somewhere ...)

To extract an individual component from this fit, e.g. mean1, use vector indexing:

opt3$par["mean1"]  ## 27.820813

Optimisation issue in R

Answers (1)

Related Questions