Nairolf

Reputation: 2556

Understanding the maxit argument of optim() in R

In the following call to optim() I would expect one evaluation of fn() and one of gr(), because maxit=1. However, fn() and gr() are evaluated 7 times each.

optim(par=1000, fn=function(x) x^2, gr=function(x) 2*x,
      method="L-BFGS-B", control=list(maxit=1))$counts
function gradient 
       7        7 

Why is this? Is it a bug? Why does optim() perform 7 evaluations when it runs only one iteration?


More detailed output:

optim(par=1000,
      fn=function(x) { cat("f(", x, ")", sep="", fill=TRUE); x^2 },
      gr=function(x) { cat("g(", x, ")", sep="", fill=TRUE); 2*x },
      method="L-BFGS-B", control=list(maxit=1))$counts
f(1000)
g(1000)
f(999)
g(999)
f(995)
g(995)
f(979)
g(979)
f(915)
g(915)
f(659)
g(659)
f(1.136868e-13)
g(1.136868e-13)
function gradient 
       7        7 

(Tested with R version 3.5.0.)

Upvotes: 6

Views: 7392

Answers (3)

paoloeusebi

Reputation: 1086

Here you can find a good explanation.

https://stat.ethz.ch/pipermail/r-devel/2010-August/058081.html

The key point is that the objective function is evaluated more than once within a single iteration. You can verify this by increasing maxit to 2, which results in exactly one more function evaluation.
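For example, comparing the counts for one vs. two iterations on the same toy objective as in the question (a quick sketch; the exact numbers may vary across R versions):

```r
fn <- function(x) x^2
gr <- function(x) 2 * x

# counts after one iteration vs. two iterations of L-BFGS-B
c1 <- optim(par = 1000, fn = fn, gr = gr,
            method = "L-BFGS-B", control = list(maxit = 1))$counts
c2 <- optim(par = 1000, fn = fn, gr = gr,
            method = "L-BFGS-B", control = list(maxit = 2))$counts
c1
c2
```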

Upvotes: 1

twedl

Reputation: 1648

Docs:

See https://stat.ethz.ch/R-manual/R-devel/library/stats/html/optim.html — the convergence component of the return value tells you whether the iteration limit was hit:

convergence 
An integer code. 0 indicates successful completion (which is always the case for "SANN" and "Brent"). Possible error codes are

1      indicates that the iteration limit maxit had been reached.

Running your code (but looking at convergence instead of counts), I get:

> optim(par=1000,
+       fn=function(x) { cat("f(", x, ")", sep="", fill=TRUE); x^2 },
+       gr=function(x) { cat("g(", x, ")", sep="", fill=TRUE); 2*x },
+       method="L-BFGS-B", control=list(maxit=1))$convergence
f(1000)
g(1000)
f(999)
g(999)
f(995)
g(995)
f(979)
g(979)
f(915)
g(915)
f(659)
g(659)
f(1.136868e-13)
g(1.136868e-13)
[1] 1

So optim() ran exactly one iteration and stopped, returning convergence = 1. The other key is the description of counts, which says:

counts  
A two-element integer vector giving the number of calls to fn and gr respectively. 
This excludes those calls needed to compute the Hessian, if requested, and any calls 
to fn to compute a finite-difference approximation to the gradient.

In other words, a single iteration can involve several calls to fn and gr. To see exactly how many calls each method makes per iteration, you can read the underlying C code.
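For example, if you drop gr entirely, "L-BFGS-B" approximates the gradient by finite differences, and (as the counts description says) those extra fn calls are not included in the function count (a quick sketch; output depends on your R version):

```r
# Without an analytic gradient, each "gradient" evaluation is really
# several hidden fn calls (finite differences), which are excluded
# from counts["function"].
cnt <- optim(par = 1000, fn = function(x) x^2,
             method = "L-BFGS-B", control = list(maxit = 1))$counts
cnt
```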

Upvotes: 1

Ben Bolker

Reputation: 226712

An iteration is one iteration of the optimization algorithm. A function evaluation is a single call to the objective function. How many function evaluations are required for each iteration will depend on:

  • what algorithm is being used (e.g. Nelder-Mead vs BFGS vs. ...)
  • how one iteration step works
    • e.g. for Nelder-Mead an iteration comprises (1) reflection; (2) [maybe] expansion; (3) [maybe] contraction; (4) [maybe] shrinkage; there's always one evaluation (reflection), but the other steps are conditional on what happens in the first sub-step
    • for L-BFGS-B I think a line search is involved ...
  • whether derivatives need to be computed by finite differences

For what it's worth, nlminb allows separate control of max iterations and max evaluations:

‘eval.max’ Maximum number of evaluations of the objective function allowed. Defaults to 200.
‘iter.max’ Maximum number of iterations allowed. Defaults to 150.
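So with nlminb you can cap the two separately. A minimal sketch on the same toy problem (return-component names as documented in ?nlminb):

```r
# Cap iterations at 1 but leave evaluations at the default;
# nlminb reports the two counts separately.
res <- nlminb(start = 1000, objective = function(x) x^2,
              gradient = function(x) 2 * x,
              control = list(iter.max = 1, eval.max = 200))
res$iterations   # capped by iter.max
res$evaluations  # function/gradient evaluations; can exceed the iteration count
```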

Upvotes: 5
