millie0725
millie0725

Reputation: 393

Back transforming pscl hurdle model output in R

Here's the setup of my hurdle model from the pscl package:

m1 <- hurdle(p_eaten ~ state + species + year, data = herb, dist = "negbin", 
                 zero.dist = "binomial")

Where p_eaten is the amount of a leaf eaten (from 0-100), state is the climate the plant was grown in (warmed or control), species is the species of the measured plant, and year is the year the measurements were taken. Here's the output of the model:

summary(m1)
Pearson residuals:
    Min      1Q  Median      3Q     Max 
-0.6266 -0.4131 -0.2269 -0.1505 10.6787 

Count model coefficients (truncated negbin with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  0.73960    0.21122   3.502 0.000463 ***
statewarmed -0.33980    0.11186  -3.038 0.002384 ** 
speciesEugr -0.43631    0.19803  -2.203 0.027581 *  
speciesHisp -0.23574    0.26934  -0.875 0.381447    
speciesHype -0.34914    0.65731  -0.531 0.595307    
speciesPhpr  0.93623    0.28083   3.334 0.000857 ***
speciesPopr  1.14931    0.28587   4.020 5.81e-05 ***
speciesSoca -0.01586    0.15211  -0.104 0.916955    
year         0.37092    0.04463   8.312  < 2e-16 ***
Log(theta)  -0.78576    0.13894  -5.655 1.56e-08 ***
Zero hurdle model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  0.53933    0.15946   3.382 0.000719 ***
statewarmed -0.21288    0.10165  -2.094 0.036251 *  
speciesEugr -0.15954    0.19967  -0.799 0.424276    
speciesHisp -1.93075    0.22196  -8.699  < 2e-16 ***
speciesHype -1.32312    0.52157  -2.537 0.011187 *  
speciesPhpr -0.90139    0.24752  -3.642 0.000271 ***
speciesPopr -2.28887    0.22930  -9.982  < 2e-16 ***
speciesSoca -0.09080    0.15905  -0.571 0.568069    
year        -0.15437    0.03305  -4.670 3.01e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Theta: count = 0.4558
Number of iterations in BFGS optimization: 23 
Log-likelihood: -3300 on 19 Df

In the output, I understand that the count model fits a model to the non-zero observations, while the zero hurdle model calculates the probability that an observation is not zero.

So in this case, I believe its showing that warmed plants (state: warmed) have a significantly lower chance of being eaten (p=0.036), and upon being eaten, also experience less amounts of herbivory (p=0.0024).

However, how do I backtransform these models into effect sizes? Here's what I thought was correct:

# calculating effect size for count model - accounting for log link
exp(0.73960 + -0.33980*0) # 2.095097 control plants
exp(0.73960 + -0.33980*1) # 1.491526 warmed plants
# effect:
1.491526 - 2.095097 # 0.603571 less herbivory on warmed plants (compared to control)

# calculating effect size of zero hurdle model - accounting for logit link
exp(0.53933 + -0.21288*0)/(1+exp(0.53933 + -0.21288*0)) # 0.6316565 control plants
exp(0.53933 + -0.21288*1)/(1+exp(0.53933 + -0.21288*1)) # 0.5808954 warmed plants
# effect:
0.5808954 - 0.6316565 # -0.0507611, so a 5 % lesser chance of experiencing herb. for warmed plants

Where for the count model, the values back transform into the original units of the data.

But, my main question is: does the zero hurdle model back transform into a percentage? In my example above, the difference between warmed and control plants was 0.05, so is the probability of warmed plants being eaten at all 5% less than control plants? Or am I wrong to be thinking of this as a percentage? Thanks!

Using R version 4.0.2, Mac OS X 10.13.6

Upvotes: 1

Views: 118

Answers (0)

Related Questions