Mercier
Mercier

Reputation: 101

Asymptotic Variance of maximum likelihood estimator with optim in R

I simulated 100 observations from a gamma density:

x <- rgamma(100,shape=5,rate=5)

I try to obtain the asymptotic variance of the maximum likelihood estimators with the optim function in R. To do so, I calculated manually the expression of the loglikelihood of a gamma density and and I multiply it by -1 because optim is for a minimum. Here is my code:

min <- function(data, par) {
    with(data, 1/par[2]*sum(x)-(par[1]-1)*sum(log(x))+n*(log(gamma(par[1]))+par[1]*log(par[2])))
}

mle <- optim(par=c(0,1),min,method='BFGS', hessian=TRUE)

AV <- 1 / mle$hessian

However, I obtain the following error(because of the 2nd line of code) :

Error in eval(substitute(expr), data, enclos = parent.frame()) : 
    numeric 'envir' arg not of length one 

Is there a way to solve this problem? Any hints would be appreciate. Thank you.

P.S: I know that I can find the asymptotic variance with:

library(MASS)

AV <- (fitdistr(x, "gamma", start=list(shape=1, rate=1))$sd)^2

But that's not the way that I am looking for.

Upvotes: 1

Views: 1955

Answers (2)

Ben Bolker
Ben Bolker

Reputation: 226871

@tguzella's answer is very thorough and pedagogical. For convenience, you can also do this:

set.seed(101)
x <- rgamma(100,shape=5,rate=5)
library("bbmle")
m1 <- mle2(x~dgamma(shape,rate=rate),
           start=list(shape=2,rate=2), ## anything reasonable
           data=data.frame(x)  ## data must be specified as a data frame
      )

We get a warning message, which is harmless in this case but which we can alleviate as follows:

m2 <- update(m1,lower=c(0,0),
             method="L-BFGS-B")

Now we can easily get the point estimates and asymptotic variance-covariance matrix:

coef(m2)
vcov(m2)

Note:

  • bbmle::mle2 is an extension of stats4::mle, which should also work for this problem (mle2 has a few extra bells and whistles and is a little bit more robust), although you would have to define the log-likelihood function as something like:
 nLL <- function(shape,rate) {
     -sum(dgamma(x,shape,rate=rate,log=TRUE))
 }
  • in general it's a good idea to use the built-in distribution functions such as dgamma(), when they're suitable for your problem; they are well-tested, robust, and it makes the code easier to read.

Upvotes: 2

tguzella
tguzella

Reputation: 1481

Ok, let's go by parts.

First of all, your expression for the minus log likelihood is incorrect, as it is actually parameterized as a function of the shape and scale parameters. Since you sampled the data by defining the shape and rate parameters, it is simpler to maintain consistency. See https://en.wikipedia.org/wiki/Gamma_distribution

Second, the specification of the function for optim is incorrect; the documentation explicitly mentions that in the function "first argument the vector of parameters over which minimization is to take place". In the code below, I left the data as a global variable that is accessed by the function to be minimized.

Third, in such a scenario it is advisable to enforce the constraints on the two parameters being estimated, otherwise the fitting may fail in some cases, depending on the sampled data.

Finally, the calculation of the asymptotic variance is incorrect: you need to invert the Fisher's information matrix; see, e.g., https://stats.stackexchange.com/questions/68080/basic-question-about-fisher-information-matrix-and-relationship-to-hessian-and-s

Here is the code:

# Data
x <- rgamma(100, shape = 5, rate = 5)

# Minus log likelihood function
minus_log_L_fun <- function(params) {
    a <- params[1] # shape
    b <- params[2] # rate (1 / scale)

    n <- length(x) # sample size

    log_L <- (a - 1) * sum(log(x)) - b * sum(x) + n * a * log(b) - n * log(gamma(a))    
    return (-log_L)
}

# Impose constraints on the estimates of the shape and rate parameters: both being strictly positive
# Use algorithm 'L-BFGS-B', since it allows for box constraints
mle <- optim(par = c(1, 1), minus_log_L_fun, method = "L-BFGS-B", lower = c(1e-10, 1e-10), upper = c(100, 100), hessian = TRUE)

# Retrieve the point estimates
shape_fit <- mle$par[1]
rate_fit <- mle$par[2]

# Fisher Information Matrix (equal to the Hessian, since minimizing minus log likelihood)
I <- mle$hessian
# Obtain the asymptotic variances (need to invert the FIM)
var_Theta <- diag(solve(I))

cat(sprintf("Results:\n"))
cat(sprintf("Point estimates: shape = %g, rate = %g\n", shape_fit,     rate_fit))
cat(sprintf("Asymptotic ML variances: shape = %g, rate = %g\n", var_Theta[1], var_Theta[2]))

producing, for a particular run:

Results:
Point estimates: shape = 4.25661, rate = 4.08384
Asymptotic ML variances: shape = 0.336318, rate = 0.34875

Using the MASS package to confirm:

library(MASS)

res <- fitdistr(x, "gamma", start = list(shape = 1, rate = 1))
MASS_PE <- res$estimate
MASS_AV <- (res$sd)^2

cat(sprintf("From the MASS package:\n"))
cat(sprintf("Point estimates:\n"))
print(MASS_PE)
cat(sprintf("Asymptotic variances:\n"))
print(MASS_AV)

leads to:

From the MASS package:
Point estimates:
   shape     rate 
4.256613 4.083836 
Asymptotic variances:
> print(MASS_AV)
    shape      rate 
0.3363179 0.3487502

Upvotes: 5

Related Questions