Ernest A
Ernest A

Reputation: 7839

Accuracy of maximum likelihood estimators

Here's a test for comparing ML estimators of the lambda parameter of a Poisson distribution.

with(data.frame(x=rpois(2000, 1.5), i=LETTERS[1:20]),
     cbind(cf=tapply(x, i, mean),
           iter=optim(rep(1, length(levels(i))), function(par) 
             -sum(x * log(par[i]) - par[i]), method='BFGS')$par))

The first column shows the ML estimator obtained from the closed-form solution (for reference), while the second column shows the ML estimator obtained by maximizing a log-likelihood function using the BFGS method. Results:

    cf     iter
A 1.38 1.380054
B 1.61 1.609101
C 1.49 1.490903
D 1.47 1.468520
E 1.57 1.569831
F 1.63 1.630244
G 1.33 1.330469
H 1.63 1.630244
I 1.27 1.270003
J 1.64 1.641064
K 1.58 1.579308
L 1.54 1.540839
M 1.49 1.490903
N 1.50 1.501168
O 1.69 1.689926
P 1.52 1.520876
Q 1.48 1.479891
R 1.64 1.641064
S 1.46 1.459310
T 1.57 1.569831

It can be seen the estimators obtained with the iterative optimization method can deviate quite a lot from the correct value. Is this something to be expected or is there another (multi-dimensional) optimization technique that would produce a better approximation?

Upvotes: 4

Views: 1377

Answers (2)

Ernest A
Ernest A

Reputation: 7839

Answer provided by Chase:

the reltol parameter which gets passed to control() lets you adjust the threshold of the convergence. You can try playing with that if necessary.

Edit:

This is a modified version of the code now including the option reltol=.Machine$double.eps, which will give the greatest possible accuracy:

with(data.frame(x=rpois(2000, 1.5), i=LETTERS[1:20]),
     cbind(cf=tapply(x, i, mean),
           iter=optim(rep(1, length(levels(i))), function(par) 
             -sum(x * log(par[i]) - par[i]), method='BFGS',
             control=list(reltol=.Machine$double.eps))$par))

And the result is:

    cf iter
A 1.65 1.65
B 1.54 1.54
C 1.80 1.80
D 1.44 1.44
E 1.53 1.53
F 1.43 1.43
G 1.52 1.52
H 1.57 1.57
I 1.61 1.61
J 1.34 1.34
K 1.62 1.62
L 1.23 1.23
M 1.47 1.47
N 1.18 1.18
O 1.38 1.38
P 1.44 1.44
Q 1.66 1.66
R 1.46 1.46
S 1.78 1.78
T 1.52 1.52

So, the error made by the optimization algorithm (ie. the difference between cf and iter) is now reduced to zero.

Upvotes: 7

Greg Snow
Greg Snow

Reputation: 49640

In addition to setting the reltol argument, also consider that you were really doing a bunch of optimizations across one parameter, the optimize function works better than optim for single parameter cases, that may work better for your real problem (if it is really one dimensional).

Upvotes: 1

Related Questions