YesSure
YesSure

Reputation: 195

Optimisation issue in R

Question relating to optim function in R

I have the following code so far and need to know to input my values of X and T. X is a vector of 10 values and T is vector of 10*2 values relating to the means and variances. I want the output to be in the format of one new value for alpha, mean1, mean2, var1, and var2. Not sure how to get the input data in properly.

I want to run all values of X in this function but only the first row of T (10 values) and im not sure how to do this for T. I have a different function for the 2nd row.

R <-runif(10, 0, 1)
S <-1-R
T <-t(cbind(R,S))


X <- runif(10, 25, 35)


Data1 <- function(xy) { 
  alpha <- xy[1]
  mean1 <- xy[2]
  mean2 <- xy[3]
  var1 <- xy[4]
  var2 <- xy[5] 

  -sum(0.5*(((X)-mean1)/var1)^2+alpha*mean1+log(2.5*var1)+log(exp(-alpha*mean1)+exp(-alpha*mean2))*(T))
}
starting_values <- c(0.3, 28, 38, 4, 3)
optim(starting_values, Data1, lower=c(0, 0, 0, 0, 0), method='L-BFGS-B')

also getting the following error code:

Error in optim(starting_values, Data1, lower = c(0, 0, 0, 0, 0), method = "L-BFGS-B") : 
  L-BFGS-B needs finite values of 'fn'

Cheers for any help.

EDIT

second function for inclusion

0.5*((y1-mean2)/var2)^2+alpha*mean2+log(2.5*var2)+ log(exp(-alpha*mean1)+exp(-alpha*mean2)))*T

Ok so to explain as clearly as possible what i want to do. The first function in the original post above takes all 10 values of X one at a time and should take the first row of T data (labelled R here) and im not sure how to do this.

The second function detailed above should again take all 10 values of X in succession and then the second row of data from T (labelled as S below)

all this is then summed together. The five unknown parameters are thusly estimated.

T

       [,1]      [,2]      [,3]       [,4]      [,5]        [,6]      [,7]      [,8]      [,9]     [,10]
R 0.1477715 0.3055021 0.2963543 0.04149945 0.8342484 0.996865333 0.1592568 0.4623762 0.8000778 0.6979342
S 0.8522285 0.6944979 0.7036457 0.95850055 0.1657516 0.003134667 0.8407432 0.5376238 0.1999222 0.3020658

Edit2

im not getting the same values as ben, even with running the same seed. I have checked that i have all the packages installed and it would appear i do. Im not getting the same final answers and im also unable to call an individual item of opt2$par. Instead of providing reams of output, i will provide the first few lines and the last few.

0.3 28 38 4 3 -74.97014 -120.7212 
Loading required package: BB
Loading required package: quadprog
Loading required package: ucminf
Loading required package: Rcgmin
Loading required package: Rvmmin

Attaching package: ‘Rvmmin’

The following object(s) are masked from ‘package:optimx’:

    optansout

Loading required package: minqa
Loading required package: Rcpp
0.3 28 38 4 3 -74.97014 -120.7212 
0.9501 28 38 4 3 -176.3368 -265.9074 
1.9001 28 38 4 3 -324.7782 -478.4652 
0.9501 28.95 38 4 3 -179.9994 -260.8711 
0.9501 28 38.95 4 3 -176.3366 -283.0445 
0.9501 28 38 4.95 3 -176.7836 -265.9074 
0.9501 28 38 4 3.95 -176.3368 -254.6188 

.................

16.32409 27.86113 38.54337 3.940143 2.504167 -2566.194 -3826.233 
16.32409 27.86113 38.54337 3.940044 2.504167 -2566.194 -3826.233 
16.32409 27.86113 38.54337 3.940093 2.504199 -2566.194 -3826.232 
16.32409 27.86113 38.54337 3.940093 2.504136 -2566.194 -3826.234 
> opt2$par
$par
[1] 16.324085 27.861134 38.543373  3.940093  2.504167

> opt2$par["mean1"]
$<NA>
NULL

Upvotes: 0

Views: 2720

Answers (1)

Ben Bolker
Ben Bolker

Reputation: 226731

A first crack: I used your code above. I added set.seed(101) at the beginning for reproducibility.

Reformatted the function slightly for clarity, but without changing anything significant, and added a cat() statement for debugging purposes:

Data1 <- function(xy) {
    alpha <- xy[1]; mean1 <- xy[2]; mean2 <- xy[3]
    var1 <- xy[4]; var2 <- xy[5]

    r1 <- -sum(0.5*((X-mean1)/var1)^2+
           alpha*mean1+
           log(2.5*var1)+
           log(exp(-alpha*mean1)+
               exp(-alpha*mean2))*T[1,])
    r2 <- -sum(0.5*((X-mean2)/var2)^2+
           alpha*mean2+
           log(2.5*var2)+
           log(exp(-alpha*mean1)+exp(-alpha*mean2))*T[2,])

    cat(xy,r1,r2,"\n")
   r1+r2
}

A slightly compressed version, that (1) takes advantage of with to make the function cleaner; (2) uses R's replication and vector-recycling capabilities

Data2 <- function(xy) {
    with(as.list(xy),
    {
       mmat <- rep(c(mean1,mean2),each=ncol(T))
       vmat <- rep(c(var1,var2),each=ncol(T))
       -sum(0.5*((X-mmat)/vmat)^2+
          alpha*mmat+
          log(2.5*vmat)+
          log(exp(-alpha*mean1)+exp(-alpha*mean2))*T)
    })
}

Now we need a named vector of starting values:

 starting_values <- c(alpha=0.3, mean1=28, mean2=38, var1=4, var2=3)

Check that the results match:

 Data1(starting_values) ##  [1] -195.6913
 Data2(starting_values) ##  [1] -195.6913

This fails (but gives us information on how it fails):

 optim(par=starting_values, Data1, lower=rep(1e-4,5), method='L-BFGS-B',
     control=list(trace=6))

It produces a lot of output, ending with:

##  21.29998 27.97361 37.98915 4.011199 3.001 -6014.225 
## 21.29998 27.97361 37.98915 4.011199 2.999 -6014.225 
## 85.29991 27.89318 37.95606 4.04533 3 Inf 
## Error in optim(par = starting_values, Data1, lower = rep(1e-04, 5), 
##    method = "L-BFGS-B",  : 
##     L-BFGS-B needs finite values of 'fn'

This at least tells you where things went wrong. I would now try evaluating your expression piece-by-piece to see which bit overflowed.

As a commenter (Justin) in the chat room said,

your third term log(exp(...) + exp(...)) goes to -Inf very quickly since alpha, mean1 and mean2 are unbounded. exp(-large number * large number) ~ 0

For further debugging, you can:

  • try to rearrange the evaluation of your function to avoid overflows
  • set upper bounds on some of the parameters to avoid overflows
  • have the function test and return very large values rather than Inf in appropriate cases

Unfortunately, L-BFGS-B is more fragile than some of the other optimizers, and doesn't allow non-finite values.

Next I tried the bobyqa optimizer from the optimx package, which allows bounds and handles non-finite values (and is a derivative-free method, which in general tend to be slightly slower but more robust than the derivative-based methods): it seems to work OK, although I don't know if the answers are sensible or not.

library(optimx)
opt2 <- optimx(par=starting_values, 
      Data1, lower=rep(1e-4,5), method='bobyqa')
opt3 <- optimx(par=starting_values, 
      Data2, lower=rep(1e-4,5), method='bobyqa')

Looks OK (provided this is a sensible answer, which I don't know).

> opt2$par
$par
    alpha     mean1     mean2      var1      var2 
16.330752 27.815324 38.497483  3.894179  2.447219 

> opt3$par
$par
    alpha     mean1     mean2      var1      var2 
16.330900 27.820813 38.491290  3.887975  2.456052 

Note that the answers are slightly different (by about 0.5% in the case of var2), which suggests that the fit may be slightly unstable/the surface may be quite flat. (Data1 and Data2 are supposed to give identical answers, and do so for the starting values, but I guess the order of operations makes them give very slightly different answers for some inputs -- or I screwed up somewhere ...)

To extract an individual component from this fit, e.g. mean1, use vector indexing:

opt3$par["mean1"]  ## 27.820813

Upvotes: 7

Related Questions