A strange error from an algorithm

Question

I have written this function that computes the MLE from a Cauchy distribution numerically based on the Newton-Raphson algorithm:

mlec <- function(x,theta0=median(x),numstp=100,eps=0.01){
    numfin <- numstp
    ic <- 0
    istop <- 0
    while(istop==0){
        ic <- ic+1
        ltheta <- -2*sum((x-theta0)/(1+(x-theta0)^2))
        lprimetheta <- -2*(sum(2*(x-theta0)^2/
                            (1+(x-theta0)^2)^2-1/(1+(x-theta0)^2)^2))
        theta1 <- theta0-(ltheta/lprimetheta)
        check <- abs((theta1-theta0)/theta1)
        if(check < eps ) { istop <- 1 }
        theta0 <- theta1
    }
    list(theta1=theta1,check=check,realnumstps=ic)
}

The goal is then to generate observations from a Cauchy distribution with scale parameter 2 and see how the MLE performs. The problem is that while for some samples, the MLE runs wonderfully for others I get the strange error

Error in if (check < eps) { : missing value where TRUE/FALSE needed

What is going on here? I have defined what "check" is so that shouldn't happen.

Thank you.

Ben Bolker · Accepted Answer

I've added a little bit of instrumentation (see the cat() statement in the middle), and fixed the second-derivative expression (fixed: see below):

mlec <- function(x,theta0=median(x),numstp=100,eps=0.01,
                 debug=TRUE,fixed=FALSE){
    numfin <- numstp
    ic <- 0
    istop <- 0
    while(istop==0){
        ic <- ic+1
        ltheta <- -2*sum((x-theta0)/(1+(x-theta0)^2))
        lprimetheta <- -2*(sum(2*(x-theta0)^2/
                            (1+(x-theta0)^2)^2-1/(1+(x-theta0)^2)^2))
        if (!fixed) {
            theta1 <- theta0-(ltheta/lprimetheta)
        } else theta1 <- theta0-ltheta/ff(theta0)
        check <- abs((theta1-theta0)/theta1)
        if (debug) cat(ic,ltheta,lprimetheta,theta0,theta1,check,"
")
        if(check < eps ) { istop <- 1 }
        theta0 <- theta1
    }
    list(theta1=theta1,check=check,realnumstps=ic)
}

set.seed(1)
x <- rcauchy(100,2)

mlec(x)

Here's the tail end of the output:

## ic  ltheta        lprimetheta      theta0        theta1     check
## 427 -4.48838e-75 -2.014555e-151 -4.455951e+76 -6.683926e+76 0.3333333 
## 428 -2.992253e-75 -8.953579e-152 -6.683926e+76 -1.002589e+77 0.3333333 
## 429 -1.994835e-75 -3.979368e-152 -1.002589e+77 -1.503883e+77 0.3333333 
## 430 -1.32989e-75 0 -1.503883e+77 -Inf NaN

Now, why is it happening? Either you've got a bug somewhere, or the algorithm is unstable. tl;dr it turns out the answer is actually "both"; your second-derivative expression seems wrong, but even it were correct the N-R algorithm is extremely unstable for this problem.

Here are your derivative and second-derivative functions (I'm wrapping them with Vectorize() for convenience so I can evaluate these at multiple theta values simultaneously):

lthetafun <- Vectorize(function(theta) {
    -2*sum((x-theta)/(1+(x-theta)^2))
})
lprimethetafun <- Vectorize(function(theta) {
    -2*(sum(2*(x-theta)^2/
                (1+(x-theta)^2)^2-1/(1+(x-theta)^2)^2))
})

A negative log-likelihood function based on the built-in dcauchy function:

thetafun <- Vectorize(function(theta) -sum(dcauchy(x,theta,log=TRUE)))

Check differentiation (at an arbitrarily chosen point):

library("numDeriv")
all.equal(grad(thetafun,2),lthetafun(2))  ## TRUE

Check second derivative:

hessian(thetafun,2) ## 36.13297
lprimethetafun(2)   ## 8.609859: ???

I think your second-derivative expression is wrong.

The following alternative second-derivative function is based on lazily cheating with Wolfram Alpha, differentiating your gradient function (which matches with the finite-difference approximation):

ff <- Vectorize(function(theta)
    2*sum(((x-theta)^2-1)/((x-theta)^2+1)^2))
ff(2)  ## matches hessian() above.

But it looks like you may have further problems.

The negative log-likelihood surface looks OK:

curve(thetafun, from=-10,to=10,n=501)

enter image description here

But trouble is on the horizon:

curve(lthetafun, from=-10,to=10, n=501)

enter image description here

This looks irregular, and going up one level to the second derivative shows that it is:

curve(ff, from=-10, to=10, n=501)

enter image description here

Here's the curve of N-R updates:

ff2 <- function(x) x-lthetafun(x)/ff(x)
curve(ff2, from=-10, to=10, n=501,ylim=c(-100,100))

enter image description here

Yikes! This indicates why the Newton-Raphson method could go wrong unless you start close enough to the minimum (any time the likelihood surface has an inflection point, the N-R updating curve will have a pole ...). Further analysis of the problem would probably tell you why the second derivative of the Cauchy is so scary.

If you just want to find the MLE you can do it by some more robust 1-D method:

library("bbmle")
mle2(x~dcauchy(location=m),
     data=data.frame(x),
     start=list(m=median(x)),
     method="Brent",
     lower=-100,upper=100)
## 
## Call:
## mle2(minuslogl = x ~ dcauchy(location = m), start = list(m = median(x)), 
##     method = "Brent", data = data.frame(x), lower = -100, upper = 100)
## 
## Coefficients:
##       m 
## 1.90179 
## 
## Log-likelihood: -262.96 
##

If you start close enough, N-R seems to work OK:

  mlec(x,1.85,debug=FALSE,fixed=TRUE,eps=0.0001)
 ## $theta1
 ## [1] 1.901592
 ## 
 ## $check
 ## [1] 5.214763e-05
 ## 
 ## $realnumstps
 ## [1] 37079

A strange error from an algorithm

Answers (1)

Related Questions