Reputation: 1039
I have written this function that computes the MLE from a Cauchy distribution numerically based on the Newton-Raphson algorithm:
mlec <- function(x,theta0=median(x),numstp=100,eps=0.01){
numfin <- numstp
ic <- 0
istop <- 0
while(istop==0){
ic <- ic+1
ltheta <- -2*sum((x-theta0)/(1+(x-theta0)^2))
lprimetheta <- -2*(sum(2*(x-theta0)^2/
(1+(x-theta0)^2)^2-1/(1+(x-theta0)^2)^2))
theta1 <- theta0-(ltheta/lprimetheta)
check <- abs((theta1-theta0)/theta1)
if(check < eps ) { istop <- 1 }
theta0 <- theta1
}
list(theta1=theta1,check=check,realnumstps=ic)
}
The goal is then to generate observations from a Cauchy distribution with scale parameter 2 and see how the MLE performs. The problem is that while for some samples, the MLE runs wonderfully for others I get the strange error
Error in if (check < eps) { : missing value where TRUE/FALSE needed
What is going on here? I have defined what "check" is so that shouldn't happen.
Thank you.
Upvotes: 3
Views: 160
Reputation: 226332
I've added a little bit of instrumentation (see the cat()
statement in the middle), and fixed the second-derivative expression (fixed
: see below):
mlec <- function(x,theta0=median(x),numstp=100,eps=0.01,
debug=TRUE,fixed=FALSE){
numfin <- numstp
ic <- 0
istop <- 0
while(istop==0){
ic <- ic+1
ltheta <- -2*sum((x-theta0)/(1+(x-theta0)^2))
lprimetheta <- -2*(sum(2*(x-theta0)^2/
(1+(x-theta0)^2)^2-1/(1+(x-theta0)^2)^2))
if (!fixed) {
theta1 <- theta0-(ltheta/lprimetheta)
} else theta1 <- theta0-ltheta/ff(theta0)
check <- abs((theta1-theta0)/theta1)
if (debug) cat(ic,ltheta,lprimetheta,theta0,theta1,check,"\n")
if(check < eps ) { istop <- 1 }
theta0 <- theta1
}
list(theta1=theta1,check=check,realnumstps=ic)
}
set.seed(1)
x <- rcauchy(100,2)
mlec(x)
Here's the tail end of the output:
## ic ltheta lprimetheta theta0 theta1 check
## 427 -4.48838e-75 -2.014555e-151 -4.455951e+76 -6.683926e+76 0.3333333
## 428 -2.992253e-75 -8.953579e-152 -6.683926e+76 -1.002589e+77 0.3333333
## 429 -1.994835e-75 -3.979368e-152 -1.002589e+77 -1.503883e+77 0.3333333
## 430 -1.32989e-75 0 -1.503883e+77 -Inf NaN
Now, why is it happening? Either you've got a bug somewhere, or the algorithm is unstable. tl;dr it turns out the answer is actually "both"; your second-derivative expression seems wrong, but even it were correct the N-R algorithm is extremely unstable for this problem.
Here are your derivative and second-derivative functions (I'm wrapping them with Vectorize()
for convenience so I can evaluate these at multiple theta
values simultaneously):
lthetafun <- Vectorize(function(theta) {
-2*sum((x-theta)/(1+(x-theta)^2))
})
lprimethetafun <- Vectorize(function(theta) {
-2*(sum(2*(x-theta)^2/
(1+(x-theta)^2)^2-1/(1+(x-theta)^2)^2))
})
A negative log-likelihood function based on the built-in dcauchy
function:
thetafun <- Vectorize(function(theta) -sum(dcauchy(x,theta,log=TRUE)))
Check differentiation (at an arbitrarily chosen point):
library("numDeriv")
all.equal(grad(thetafun,2),lthetafun(2)) ## TRUE
Check second derivative:
hessian(thetafun,2) ## 36.13297
lprimethetafun(2) ## 8.609859: ???
I think your second-derivative expression is wrong.
The following alternative second-derivative function is based on lazily cheating with Wolfram Alpha, differentiating your gradient function (which matches with the finite-difference approximation):
ff <- Vectorize(function(theta)
2*sum(((x-theta)^2-1)/((x-theta)^2+1)^2))
ff(2) ## matches hessian() above.
But it looks like you may have further problems.
The negative log-likelihood surface looks OK:
curve(thetafun, from=-10,to=10,n=501)
But trouble is on the horizon:
curve(lthetafun, from=-10,to=10, n=501)
This looks irregular, and going up one level to the second derivative shows that it is:
curve(ff, from=-10, to=10, n=501)
Here's the curve of N-R updates:
ff2 <- function(x) x-lthetafun(x)/ff(x)
curve(ff2, from=-10, to=10, n=501,ylim=c(-100,100))
Yikes! This indicates why the Newton-Raphson method could go wrong unless you start close enough to the minimum (any time the likelihood surface has an inflection point, the N-R updating curve will have a pole ...). Further analysis of the problem would probably tell you why the second derivative of the Cauchy is so scary.
If you just want to find the MLE you can do it by some more robust 1-D method:
library("bbmle")
mle2(x~dcauchy(location=m),
data=data.frame(x),
start=list(m=median(x)),
method="Brent",
lower=-100,upper=100)
##
## Call:
## mle2(minuslogl = x ~ dcauchy(location = m), start = list(m = median(x)),
## method = "Brent", data = data.frame(x), lower = -100, upper = 100)
##
## Coefficients:
## m
## 1.90179
##
## Log-likelihood: -262.96
##
If you start close enough, N-R seems to work OK:
mlec(x,1.85,debug=FALSE,fixed=TRUE,eps=0.0001)
## $theta1
## [1] 1.901592
##
## $check
## [1] 5.214763e-05
##
## $realnumstps
## [1] 37079
Upvotes: 7