Thon
Thon

Reputation: 1

Why isn't the optim() function returning the correct estimates?

I have tried simulating a process by inputting a model and its parameters for alpha0, alpha1 and beta 1.

The model uses a poisson distribution conditional on itself to get the next value. I have tried simulating with a sample size of 200 and using 100 samples or repetitions.

a0<-5
a1<-0.9
b1<-0.2

l<-rep(1,200)
xs<-rep(0,200)
y<-rep(0,200)
s<-matrix(nrow=100, ncol=200)


xs[1]<-0
l[1]<-1

for (j in 1: 100){
  for (i in 2: 50)
  {
    l[i]<-a0+a1*xs[i-1]+b1*l[i-1]
    xs[i]<-rpois(1,lambda = l[i])
  }
  s[j,1:200]<-xs
}

And minimising the negative log likelihood using the optim functions as to get the 100 estimates for the initial parameters of alpha 0, alpha 1, and beta 1. Specifally their estimate alongside their standard deviation.However the optim funtion isn't working

loglik<-function(theta,x)
{  
  alpha0<-theta[1];
  alpha1<-theta[2];
  beta1<-theta[3]
  
  #lambda
  T<-length(x);
  
  lambda<-rep(1,T);
  likeli<-rep(1,T);
  
  for(t in (2:T))
  {
    
    lambda[t]<-alpha0+alpha1*x[t-1]+beta1*lambda[t-1]; 
    likeli[t]<-((lambda[t]^x[t])*exp(-lambda[t]))/factorial(x[t])
  }
  
  return(-log(prod(likeli)))
}

estimates<-matrix(nrow=100, ncol=3)

for(i in 1:100){
  initial<-c(6,0.8,1)
  res<-optim(initial,loglik,x=s[i,],control=list(maxit=10000),hessian=T)
  estimates[i,] <- res$par
  
}

mean(estimates[,1])
sqrt(var(estimates[,1]))
mean(estimates[,2])
sqrt(var(estimates[,2]))
mean(estimates[,3])
sqrt(var(estimates[,3]))

An its giving me an error that it could evaluate at the initial parameters

Error in optim(initial, loglik, x = s[i, ], control = list(maxit = 10000),  : 
  function cannot be evaluated at initial parameters

Where is the problem and how do I get rid of this?

Upvotes: 0

Views: 313

Answers (1)

Ben Bolker
Ben Bolker

Reputation: 226597

This doesn't completely solve your problem, but it gets you partway there:

  • compute the log-likelihood directly using dpois() (this avoids under/overflow and other possible errors)
  • use tmax and tt instead of T and t (it's best practice to avoid names of built-in variables)
  • get rid of semicolons (cosmetic)
loglik <- function(theta,x) {  
  alpha0 <- theta[1]
  alpha1 <- theta[2]
  beta1 <- theta[3]

  
  #lambda
  tmax <- length(x)  
  lambda <- rep(1,tmax)
  llik <- rep(0,tmax)
  
  for(tt in 2:tmax) { 
    lambda[tt] <- alpha0+alpha1*x[tt-1]+beta1*lambda[tt-1]
    llik[tt] <- dpois(x[tt], lambda[tt], log = TRUE)
  }
  
  cat(alpha0, alpha1, beta1, -sum(llik), "\n")
  return(-sum(llik))
}

Now when I run loglik(initial, s[1,]) I at least get a finite value ...

When I run

optim(initial,loglik,x=s[1,],control=list(maxit=10000))

(without hessian=TRUE) I do get an answer; trying to get the Hessian gives an error about non-finite finite differences. Some possible ways forward:

  • find some way to avoid predicting negative values for the Poisson mean, e.g. bound your parameter search (using lower and method = "L-BFGS-B")
  • Your simulations appear only to be going to time 50, leaving a string of 150 zeros at the end - that could definitely be making things wonky

Upvotes: 0

Related Questions