Reputation: 25
I am working with count data and trying several different Poisson fixed-effects regression models, using zeroinfl (from the pscl package) for zero-inflated models and pglm (from the pglm package) for non-zero-inflated ones. However, my R code runs very slowly: it takes more than 9-10 hours. For clarification, I am adding the fixed effects manually by including time and ID dummies.
model <- zeroinfl(y ~ x1 + x2 + x3 + x4 + as.factor(time)
                  + as.factor(ID) | 1, data = df, dist = "poisson")
I am aware of this question: R Zeroinfl model. However, my data are highly zero-inflated, with a mean of 0.587 and a median of 0, and I am afraid this feature of the data would be lost with the methods suggested there. I am fairly new to R. Any help is appreciated.
Upvotes: 0
Views: 432
Reputation: 227061
Given what you've said so far, it may be worth trying
library(glmmTMB)
model <- glmmTMB(y ~ x1 + x2 + x3 + x4 + as.factor(time)
                 + as.factor(ID),
                 ziformula = ~ 1,            # intercept-only zero-inflation component
                 data = df,
                 family = "poisson",
                 sparseX = c(cond = TRUE))   # sparse model matrix for the conditional model
You can do whatever you like with the zero-inflation component (e.g. ziformula = ~ x1 + x2 + x3 + x4 to include those covariates). If you want the zero-inflation model matrix to be sparse as well, add zi = TRUE to the sparseX vector.
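As a rough illustration of the zero-inflation options just described, here is a minimal sketch on simulated data. The data frame df_sim, its sizes, and the single covariate x1 are made up for the example and are not the asker's data; the fit may emit convergence warnings on data this small.

```r
# Sketch only: simulated data with hypothetical names and sizes
library(glmmTMB)

set.seed(1)
df_sim <- expand.grid(ID = seq_len(50), time = seq_len(20))
df_sim$x1 <- rnorm(nrow(df_sim))

# Poisson counts, then set roughly 40% of them to structural zeros
mu <- exp(0.2 + 0.5 * df_sim$x1)
df_sim$y <- rpois(nrow(df_sim), mu) * rbinom(nrow(df_sim), 1, 0.6)

fit <- glmmTMB(y ~ x1 + as.factor(time) + as.factor(ID),
               ziformula = ~ x1,                         # covariate in the zero-inflation part
               data = df_sim,
               family = poisson,
               sparseX = c(cond = TRUE, zi = TRUE))      # sparse matrices for both components

# Coefficients of the zero-inflation component (logit scale)
zi_coef <- fixef(fit)$zi
print(zi_coef)
```

Here fixef(fit)$cond would give the corresponding conditional (count) coefficients, including the time and ID dummies.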
The reason (particularly for the sparseX argument) is that generating the model matrix for a data set with 87K rows and 2500 IDs with zeroinfl will (I think) create a dense model matrix of approximately 2500*87e3*8/2^30 = 1.620501 gigabytes ...
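The size estimate can be reproduced, and the dense-versus-sparse difference checked on a scaled-down case, with the sketch below. It assumes 8 bytes per double, uses the Matrix package (shipped with R as a recommended package), and the smaller sizes (8700 rows, 250 IDs) are hypothetical values chosen only so the dense matrix fits comfortably in memory.

```r
library(Matrix)  # provides sparse.model.matrix()

# Back-of-envelope dense size at the scale discussed (87K rows, 2500 ID columns)
n_rows <- 87e3
n_ids  <- 2500
dense_gib <- n_rows * n_ids * 8 / 2^30   # 8 bytes per double, about 1.62
print(dense_gib)

# Scaled-down comparison with hypothetical sizes
set.seed(1)
id <- factor(sample(250, 8700, replace = TRUE))
X_dense  <- model.matrix(~ id)           # ordinary dense dummy-variable matrix
X_sparse <- sparse.model.matrix(~ id)    # same columns, stores only nonzero entries

print(format(object.size(X_dense),  units = "MB"))
print(format(object.size(X_sparse), units = "MB"))
```

Because each row of an ID dummy block has at most one nonzero entry, the sparse representation grows with the number of rows rather than with rows times IDs.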
Upvotes: 3