Reputation: 123
I have the following data
data<-c(1L, 4L, 5L, 10L, 13L, 8L, 3L, 5L, 13L, 9L, 5L, 10L, 9L, 4L,
4L, 13L, 10L, 10L, 7L, 7L, 3L, 1L, 11L, 4L, 5L, 9L, 10L, 3L,
2L, 7L, 8L, 4L, 5L, 6L, 3L, 4L, 13L, 7L, 8L, 6L, 5L, 3L, 10L,
4L, 8L, 8L, 2L, 9L, 5L, 2L, 8L, 7L, 6L, 6L, 6L, 4L, 3L, 9L, 11L,
6L, 7L, 7L, 3L, 4L, 18L, 14L, 8L, 9L, 5L, 3L, 7L, 3L, 8L, 3L,
9L, 3L, 4L, 7L, 7L, 5L, 8L, 7L, 10L, 9L, 9L, 11L, 8L, 3L, 9L,
10L, 11L, 9L, 12L, 13L, 9L, 15L, 11L, 13L, 3L, 24L, 11L, 13L,
14L, 14L, 5L, 10L, 6L, 10L, 8L, 9L, 13L, 5L, 8L, 8L, 6L, 17L,
11L, 11L, 8L, 2L, 14L, 6L, 1L, 7L, 5L, 3L, 12L, 6L, 10L, 7L,
15L, 9L, 7L, 3L, 9L, 11L, 3L, 5L, 14L, 7L, 3L, 20L, 17L, 14L,
7L, 11L, 11L, 2L, 4L, 9L, 5L, 10L, 7L, 10L, 13L, 7L, 18L, 13L,
18L, 20L, 16L, 9L, 5L, 13L, 16L, 11L, 9L, 7L, 12L, 13L, 21L,
9L, 7L, 13L, 4L, 7L, 5L, 13L, 19L, 17L, 8L, 7L, 4L, 18L, 14L,
8L, 8L, 16L, 13L, 9L, 14L, 8L, 20L, 7L, 12L, 14L, 8L, 16L, 10L,
9L, 20L, 5L, 7L, 8L, 16L, 11L, 10L, 12L, 20L, 5L, 2L, 21L, 16L,
18L, 0L, 16L, 4L, 6L, 16L, 6L, 15L, 15L, 10L, 8L, 13L, 22L, 14L,
5L, 8L, 11L, 14L, 7L, 9L, 7L, 7L, 8L, 5L, 12L, 6L, 20L, 10L,
17L, 9L, 7L, 13L, 9L, 13L, 15L, 18L, 10L, 8L, 10L, 12L, 16L,
16L, 11L, 13L, 8L, 8L, 20L, 16L, 11L, 14L, 18L, 10L, 8L, 17L,
24L, 8L, 15L, 16L, 9L, 10L, 22L, 15L, 16L, 16L, 20L, 16L, 7L,
12L, 10L, 16L, 16L, 17L, 16L, 13L, 4L, 14L, 14L, 18L, 11L, 4L,
3L, 10L, 19L, 9L, 9L, 10L, 4L, 9L, 9L, 5L, 6L, 13L, 7L, 4L, 2L,
7L, 13L, 6L, 4L, 3L, 6L, 5L, 2L, 9L, 6L, 10L, 9L, 3L, 2L, 7L,
12L, 14L, 12L, 12L, 2L, 4L, 7L, 5L, 7L, 9L, 5L, 6L, 6L, 9L, 10L,
6L, 11L, 4L, 6L, 3L, 5L, 3L, 5L, 4L, 10L, 7L, 4L, 6L, 9L, 11L,
6L, 10L, 3L, 1L, 9L, 9L, 11L, 8L, 3L, 5L, 7L, 6L, 8L, 8L, 9L,
4L, 2L, 5L, 7L, 13L, 6L, 12L, 3L, 9L, 7L, 4L, 6L, 8L, 11L, 9L,
4L, 5L, 10L, 11L, 17L, 15L, 3L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L,
0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L,
0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 0L, 0L, 1L, 0L, 0L, 0L, 0L,
0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 2L, 1L, 2L, 0L, 1L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
3L, 16L, 17L, 6L, 6L, 9L, 6L, 12L, 6L, 13L, 6L, 5L, 9L, 6L, 14L,
2L, 17L, 4L, 10L, 6L, 1L, 15L, 8L, 8L, 5L, 7L, 7L, 8L, 12L, 2L,
3L, 7L, 11L, 6L, 9L, 10L, 11L, 11L, 4L, 12L, 1L, 7L, 6L, 3L,
8L, 11L, 7L, 6L, 5L, 5L, 11L, 7L, 7L, 6L, 7L, 5L, 7L, 10L, 5L,
4L, 7L, 5L, 9L, 7L, 14L, 10L, 4L, 9L, 5L, 10L, 12L, 14L, 6L,
5L, 12L, 5L, 3L, 8L, 8L, 4L, 9L, 9L, 12L, 2L, 8L, 5L, 4L, 5L,
1L, 4L, 4L, 7L, 6L, 8L, 10L, 13L, 9L, 4L, 8L, 8L, 9L, 12L, 4L,
7L, 6L, 5L, 5L, 7L, 2L, 5L, 10L, 0L, 4L, 6L, 5L, 3L, 8L, 2L,
1L, 1L, 6L, 6L, 1L, 2L, 5L, 9L, 10L, 7L, 10L, 3L, 12L, 7L, 4L,
1L, 5L, 6L, 6L, 5L, 4L, 1L, 5L, 0L, 8L, 6L, 4L, 1L, 7L, 5L, 3L,
8L, 3L, 0L, 3L, 2L, 0L, 6L, 10L, 0L, 8L, 3L, 0L, 1L, 1L, 5L,
7L, 0L, 1L, 0L, 3L, 1L, 9L, 2L, 8L, 1L, 0L, 0L, 5L, 1L, 0L, 2L,
1L, 0L, 7L, 1L, 2L, 0L, 0L, 4L, 4L, 10L, 0L, 6L, 4L, 3L, 0L,
4L, 1L, 3L, 1L, 0L, 0L, 0L, 5L, 0L, 6L, 6L, 3L, 5L, 0L, 4L, 0L,
2L, 3L, 5L, 2L, 4L, 3L, 1L, 1L, 0L, 2L, 0L, 3L, 0L, 3L, 4L, 4L,
7L, 0L, 0L, 1L, 9L, 0L, 3L, 0L, 4L, 0L, 3L, 4L, 5L, 0L, 0L, 4L,
3L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L,
0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L,
0L, 0L, 0L, 1L, 0L, 0L, 1L, 2L, 0L, 0L, 0L, 2L, 0L, 0L, 0L, 0L,
1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 2L, 0L, 0L,
0L, 0L, 2L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 13L, 10L, 13L, 10L, 11L,
8L, 27L, 8L, 12L, 20L, 15L, 9L, 10L, 3L, 8L, 13L, 16L, 13L, 12L,
13L, 10L, 14L, 14L, 10L, 10L, 7L, 13L, 12L, 12L, 23L, 7L, 12L,
6L, 7L, 10L, 8L, 13L, 16L, 10L, 11L, 18L, 7L, 15L, 18L, 10L,
9L, 15L, 4L, 3L, 9L, 12L, 2L, 6L, 4L, 4L, 8L, 4L, 7L, 11L, 9L,
7L, 9L, 15L, 7L, 7L, 14L, 15L, 6L, 3L, 7L, 6L, 22L, 7L, 8L, 6L,
12L, 7L, 11L, 10L, 6L, 10L, 6L, 5L, 16L, 11L, 11L, 6L, 9L, 10L,
4L, 14L, 7L, 6L, 4L, 9L, 4L, 7L, 10L, 11L, 8L, 6L, 7L, 3L, 8L,
8L, 12L, 7L, 13L, 5L, 4L, 10L, 6L, 8L, 7L, 11L, 3L, 3L, 5L, 4L,
4L, 11L, 3L, 3L, 3L, 3L, 7L, 4L, 5L, 3L, 5L, 1L, 5L, 2L, 5L,
6L, 6L, 4L, 3L, 6L, 7L, 3L, 8L, 1L, 3L, 5L, 9L, 9L, 10L, 6L,
9L, 7L, 5L, 5L, 10L, 6L, 9L, 2L, 6L, 6L, 1L, 6L, 4L, 5L, 3L,
3L, 3L, 3L, 3L, 2L, 6L, 1L, 5L, 3L, 4L, 9L, 3L, 8L, 5L, 7L, 5L,
10L, 5L, 4L, 0L, 8L, 6L, 4L, 6L, 7L, 4L, 3L, 1L, 3L, 3L, 6L,
5L, 7L, 3L, 7L, 2L, 2L, 6L, 4L, 3L, 3L, 2L, 2L, 4L, 2L, 5L, 5L,
7L, 3L, 5L, 2L, 2L, 1L, 5L, 1L, 3L, 2L, 5L, 3L, 1L, 4L, 0L, 1L,
4L, 3L, 2L, 2L, 2L, 6L, 3L, 4L, 2L, 2L, 8L, 4L, 3L, 6L, 6L, 2L,
4L, 11L, 3L, 4L, 4L, 5L, 5L, 1L, 5L, 2L, 7L, 3L, 2L, 4L, 2L,
3L, 6L, 3L, 11L, 7L, 5L, 9L, 5L, 6L, 5L, 9L, 6L, 5L, 7L, 1L,
14L, 7L, 7L, 7L, 2L, 5L, 5L, 9L, 2L, 9L, 2L, 6L, 2L, 9L, 4L,
3L, 4L, 9L, 7L, 6L, 5L, 4L, 5L, 6L, 4L, 5L, 2L, 5L, 4L, 7L, 3L,
9L, 6L, 9L, 7L, 2L, 7L, 6L, 7L, 3L, 4L, 8L, 3L, 8L, 10L, 3L,
3L, 5L, 4L, 8L, 6L, 5L, 4L, 5L, 1L, 6L, 6L, 8L, 9L, 5L, 10L,
1L, 8L, 7L, 7L, 6L, 5L, 1L, 5L, 8L, 11L, 2L, 6L, 7L, 6L, 5L,
20L, 8L, 10L, 7L, 5L, 2L, 5L, 3L, 17L, 6L, 5L, 0L, 1L, 1L, 9L,
1L)
I have run a ZINB model and I know that it is the best fit for my data. I want to demonstrate on a graph that this distribution is my best option. I am using fitdist
library(fitdistrplus)
library(gamlss)
nb<-fitdist(data, "nbinom")
pois<-fitdist(data, "pois")
zinb<-fitdist(data, 'ZANBI',start = list(mu = 4, sigma = 0.2))
par(mfrow = c(2, 2))
plot.legend <- c("Negative binomial", "Poisson", "ZINB")
My problem is that, just as I wanted to demonstrate that nbinom
and pois
are not the best fit, I can't do it with zero inflated poissonZIP
.
I am using gamlss
zip<-fitdist(data, 'ZIP',start = list(mu = 7.09, sigma = 4.5))
Here I'm using the values suggested in here considering mean(data[data != 0])
and var(data[data != 0])
. I always get:
Error in fitdist(data, "ZIP", start = list(mu = 7.09, sigma = 4.5)) :
the function mle failed to estimate the parameters,
with the error code 100
In addition: Warning messages:
1: In fitdist(data, "ZIP", start = list(mu = 7.09, sigma = 4.5)) :
The dZIP function should return a zero-length vector when input has length zero and not raise an error
2: In fitdist(data, "ZIP", start = list(mu = 7.09, sigma = 4.5)) :
The pZIP function should return a zero-length vector when input has length zero and not raise an error
How can I plot a ZIP of my values to demonstrate is not the best fit?
Upvotes: 0
Views: 342
Reputation: 1414
The following arguments on the ZIP fit worked for me:
sigma
< 1.mu
and sigma
set respectively to (0, Inf)
and (0, 1)
,The result of running the following code on your data
array is below, which confirms that the Zero-Inflated Negative Binomial is the best fit (based on AIC and BIC).
library(fitdistrplus)
library(gamlss)
nb<-fitdist(data, "nbinom")
pois<-fitdist(data, "pois")
zinb<-fitdist(data, 'ZANBI',start = list(mu = 4, sigma = 0.2))
zip<-fitdist(data, 'ZIP', start = list(mu = 7.09, sigma = 0.5), discrete=TRUE,
optim.method="Nelder-Mead", lower = c(0, 0), upper = c(Inf, 1))
print(nb)
print(pois)
print(zinb)
print(zip)
cdfcomp(list(nb, zinb, pois, zip))
gofstat(list(nb, zinb, pois, zip))
The only thing that worries me is that the standard error of the estimated parameters for the ZIP fit are NA
...
Partial OUTPUT
Fitting of the distribution ' nbinom ' by maximum likelihood
Parameters:
estimate Std. Error
size 1.007110 0.05297338
mu 5.548579 0.16643396
Fitting of the distribution ' pois ' by maximum likelihood
Parameters:
estimate Std. Error
lambda 5.548313 0.06522914
Fitting of the distribution ' ZANBI ' by maximum likelihood
Parameters:
estimate Std. Error
mu 6.8886199 0.1549058
sigma 0.3401722 0.0266448
Fitting of the distribution ' ZIP ' by maximum likelihood
Parameters:
estimate Std. Error
mu 7.0869552 NA
sigma 0.2171502 NA
Goodness-of-fit criteria
1-mle-nbinom 2-mle-ZANBI 3-mle-pois 4-mle-ZIP
Akaike's Information Criterion 7302.831 7141.004 10169.16 7981.985
Bayesian Information Criterion 7313.177 7151.350 10174.33 7992.331
Upvotes: 1