Pepacz

Reputation: 959

R optim - retrieving values from individual iterations

I want to retrieve the values of the optimized parameters at different iterations of the optim function. The purpose is to check the convergence of the error on the validation dataset.

My question is closely related to this question, and I tried to implement the code therein, which I expected to solve my problem. However, both the value of i and $count suggest that the objective function is called many more times than specified in the maxit parameter:

vals <- list()
f1 <- function(x) {
  i <<- i + 1       # count every call to the objective function
  vals[[i]] <<- x   # record the parameter vector passed in

  x1 <- x[1]
  x2 <- x[2]
  x1^2 + 3*x2^2
}

# countBFGS 
i <- 0  
optim(c(1,1), f1, method="BFGS",control = list(trace=1, maxit = 10))$count
i
# countCG 
i <- 0 
optim(c(1,1), f1, method="CG",control = list(trace=1, maxit = 10))$count
i
# countSANN
i <- 0 
optim(c(1,1), f1, method="SANN",control = list(trace=1, maxit = 10))$count
i

Any suggestions on how to capture the optimized parameters on the fly?

Sorry for the possible duplication of the question mentioned above.

Upvotes: 1

Views: 4469

Answers (1)

G. Grothendieck

Reputation: 269471

The observed difference in counts occurs because the objective function is also called to compute the numeric derivative. If we supply the gradient, then that won't happen and the count and i will correspond. In the example below they are both 24:

vals <- NULL; i <- 0
gr1 <- function(x) c(2, 6) * x  # gradient
optim(c(1, 1), f1, gr1, method = "BFGS", control = list(trace = 1))$count

## initial  value 4.000000 
## final  value 0.000000 
## converged
## function gradient 
##       24        9 

i
## [1] 24
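
As a quick check (a sketch of my own, not part of the answer), the parameter vectors recorded in vals can be bound into a matrix to inspect the search path; with the analytic gradient supplied, the number of rows matches i and the function count:

# Sketch: one row per objective-function evaluation
path <- do.call(rbind, vals)
dim(path)   # 24 x 2
tail(path)  # the final evaluations are near the optimum at (0, 0)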

Also, if we use an optimization method that does not use derivatives in the first place, such as Nelder-Mead, then the count and i will also correspond. Try this:

vals <- NULL; i <- 0
optim(c(1, 1), f1, method = "Nelder", control = list(trace = 1))$count
i
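
Since every Nelder-Mead call to f1 is a genuine evaluation, vals now traces the entire search, and (a sketch of my own, assuming the same f1 as above) the objective values along that path show the convergence the question is after:

# Sketch: evaluate the objective along the recorded search path
obj <- sapply(vals, function(p) p[1]^2 + 3 * p[2]^2)
length(obj)                                    # equals i and the function count
plot(obj, type = "b", xlab = "evaluation", ylab = "f1(x)")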

ADDED: If using maxit, then try tracing the f1 and gr1 functions. gr1 will be evaluated maxit times, and the last evaluation of f1 prior to each gr1 evaluation could be used to monitor f1.

vals <- NULL; i <- 0
gr1 <- function(x) c(2, 6) * x  # gradient
trace(gr1, exit = quote(print(c(returnValue(), x))))
trace(f1, exit = quote(print(c(i, returnValue(), x))))
optim(c(1, 1), f1, gr1, method = "BFGS", control = list(trace = 10, maxit = 5))$count
untrace(f1)
untrace(gr1)

giving:

Tracing fn(par, ...) on exit 
[1] 1 4 1 1
initial  value 4.000000 
Tracing gr(par, ...) on exit 
[1] 2 6 1 1
Tracing fn(par, ...) on exit 
[1]  2 76 -1 -5
Tracing fn(par, ...) on exit 
[1]  3.00  0.48  0.60 -0.20
Tracing gr(par, ...) on exit 
[1]  1.2 -1.2  0.6 -0.2
Tracing fn(par, ...) on exit 
[1]  4.00000000  0.55976676 -0.73469388  0.08163265
Tracing fn(par, ...) on exit 
[1]  5.0000000  0.1728560  0.3330612 -0.1436735
Tracing gr(par, ...) on exit 
[1]  0.6661224 -0.8620408  0.3330612 -0.1436735
Tracing fn(par, ...) on exit 
[1] 6.000000e+00 1.207714e-05 1.192941e-03 1.884501e-03
Tracing gr(par, ...) on exit 
[1] 0.002385882 0.011307005 0.001192941 0.001884501
Tracing fn(par, ...) on exit 
[1]  7.000000e+00  7.788526e-09 -5.338595e-05 -4.057284e-05
Tracing gr(par, ...) on exit 
[1] -1.067719e-04 -2.434371e-04 -5.338595e-05 -4.057284e-05
final  value 0.000000 
stopped after 5 iterations
function gradient 
       7        5 
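
Alternatively, a sketch of my own under the same setup: because BFGS evaluates the gradient exactly once per iteration, wrapping gr1 in a recording closure captures the parameters at each iteration without using trace. The names gr1_rec and iter_params are hypothetical:

# Sketch: record the parameter vector at each gradient evaluation,
# i.e. once per BFGS iteration
iter_params <- list()
gr1_rec <- function(x) {
  iter_params[[length(iter_params) + 1]] <<- x
  c(2, 6) * x
}
optim(c(1, 1), f1, gr1_rec, method = "BFGS", control = list(maxit = 5))
do.call(rbind, iter_params)   # one row per iteration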

Upvotes: 4
