Charlotte Taelman
Charlotte Taelman

Reputation: 31

Zero-truncated Poisson model in INLA

I am working on a spatial SDM using INLA. My data has a Poisson distribution with no zeros. I want to apply a zero-truncated INLA Poisson model but I do not understand the meaning of the parameter E in the example code provided by INLA.

I have the code below already:

interval70 = c(min(strandline70$abundance), max(strandline70$abundance))
  N70 = nrow(strandline70)
  X70 = data.frame(Intercept = rep(1, N70),
                sand_mean = strandline70$sand_mean,
                 spring_mean = strandline70$spring_mean)
  StackFit = inla.stack(tag = "Fit",     
                        data = list(y = strandline70$abundance),
                        A = list(A3, 1),
                        effects = list(w3.index,    # fit has spatial weight
                                       X70))          # covariates

  FIT = inla.stack(StackFit)
  f = y ~ -1 + 
    Intercept + 
    sand_mean + spring_mean  + 
    f(w, model = spde)
  
  
  inla_sp = inla(f, 
                  family = "cenpoisson",
                  control.family = list(cenpoisson.I = interval70), # this interval is from 1 to 25
                  data = inla.stack.data(StackFit),
                  control.compute = list(waic=T),
                  control.predictor = list(A = inla.stack.A(StackFit)))

In example code, E is required as an additional parameter. (see pdf text. Can someone explain to me what this means and how to implement it in your own code?

The code from the pdf:

n=100
a = 0
b = 1
x = rnorm(n, sd = 0.5)
eta = a + b*x
interval = c(1, 4)
E = sample(1:10, n, replace=TRUE)
lambda = E*exp(eta)
y = rpois(n, lambda = lambda)
censored = (y >= interval[1] & y <= interval[2])
y[censored] = interval[1]

r = (inla(y ~ 1 + x,
family = "cenpoisson",
control.family = list(cenpoisson.I = interval),
data = data.frame(y, x),
E=E)) # this I don't understand



Upvotes: 1

Views: 211

Answers (1)

Martin C. Arnold
Martin C. Arnold

Reputation: 9668

Late reply, but maybe still useful to you or others:

E is for exposure: in Poisson regression, the interest often lies in modeling the rate (events per unit of exposure) rather than just the count of events. This is relevant in many applications where the expected number of events is proportional not just to the effect of the predictors but also to the amount of exposure when the exposure varies between units. For more mathematical details, I suggest consulting cross-validated. See, e.g., this question.

In a spatial SDM, you likely have different exposure across spatial units, depending on what you are modeling. For example, climate conditions, elevation etc.

In your example, different exposures affecting the rate are reflected by

E = sample(1:10, n, replace = TRUE) # generates random exposure values
lambda = E * exp(eta)               # rate param is proportional to E

i.e., there is different exposure across the 10 units, which should be accounted for in the model by including (the observed variable) E as an offset term in the log link function between lambda and the linear predictor eta. This is what the E argument of inla() is for.

Upvotes: 0

Related Questions