Reputation: 41
I am fitting a negative binomial regression to the count data at the following link: https://www.dropbox.com/s/q7fwqicw3ebvwlg/stackquestion.csv?dl=0
I got error messages when I tried to fit all the independent variables in one model, so I fitted each independent variable one at a time to find out which one causes the problem. Here is what I found:
For every other variable, the fit against the response Y (column A) looks normal:
m2 <- glm.nb(A~K, data=d)
summary(m2)
Call:
glm.nb(formula = A ~ K, data = d, init.theta = 0.5569971932,
link = log)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.5070  -1.2538  -0.4360   0.1796   1.9588
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.66185    0.84980  -0.779    0.436
K            0.25628    0.03016   8.498   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(0.557) family taken to be 1)
Null deviance: 113.202 on 56 degrees of freedom
Residual deviance: 70.092 on 55 degrees of freedom
AIC: 834.86
Number of Fisher Scoring iterations: 1
Theta: 0.5570
Std. Err.: 0.0923
2 x log-likelihood: -828.8570
However, when I fit the variable L against Y, I get this:
m1 <- glm.nb(A~L, data=d)
There were 50 or more warnings (use warnings() to see the first 50)
summary(m1)
Call:
glm.nb(formula = A ~ L, data = d, init.theta = 5136324.722, link = log)
Deviance Residuals:
   Min      1Q  Median      3Q     Max
-67.19  -18.93  -12.07   13.25   64.00
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.45341    0.01796   192.3   <2e-16 ***
L            0.24254    0.00103   235.5   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(5136325) family taken to be 1)
Null deviance: 97084 on 56 degrees of freedom
Residual deviance: 28529 on 55 degrees of freedom
AIC: 28941
Number of Fisher Scoring iterations: 1
Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, :
invalid 'nsmall' argument
You can see that init.theta and AIC are far too large, and there are 50 or more warnings plus an error message.
Each warning message reads:
In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
Actually, variables M and L are two observations of the same thing, and I did not find anything abnormal in L. Of all the columns, only L has this problem.
So I am wondering what exactly this error message means: Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, : invalid 'nsmall' argument. These are observed data, so how should I fix the error? Thank you!
Upvotes: 0
Views: 5473
Reputation: 21274
The important message is in the warnings(): when L is the independent variable, the default iteration limit is too low for the fitting procedure to converge. The warnings come from theta.ml, the step that estimates theta, so the theta reported for that fit is not a converged estimate.
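As for the prettyNum error itself, one plausible reading (an assumption on my part, not something checked against the data in the question) is that the code printing the Theta and Std. Err. lines picks the number of decimals from the standard error of theta, and a theta fit that failed to converge can leave that standard error degenerate. The rule sketched below, wrapped in the hypothetical helper decimals_from_se, does reproduce the 4 decimals printed for the A ~ K fit, and shows how a degenerate value pushes nsmall outside the range that format() accepts:

```r
# Hypothetical helper: choose decimals from a standard error.
# This rule is an assumption for illustration, not code copied from MASS.
decimals_from_se <- function(se) max(2 - floor(log10(se)), 0)

dp_ok  <- decimals_from_se(0.0923)  # the Std. Err. printed for A ~ K
dp_bad <- decimals_from_se(1e-30)   # a degenerate value chosen for illustration

# With a sensible standard error the number formats normally
ok_text <- format(0.557, nsmall = dp_ok)
ok_text

# format() only accepts 0 <= nsmall <= 20, so this call fails;
# tryCatch() returns the error object instead of stopping the script
res <- tryCatch(format(0.557, nsmall = dp_bad), error = function(e) e)
conditionMessage(res)
```

If this reading is right, the error is only a symptom of printing a fit whose theta never converged, so fixing the convergence is what matters.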
If you manually set the maxit parameter to a higher value, you can fit A ~ L without error:
glm.nb(A ~ L, data = d, control = glm.control(maxit = 500))
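To check whether any warnings remain after raising maxit, the fit can be wrapped so that warnings are collected and counted rather than scrolling past. A minimal sketch on simulated data (the data frame sim, its coefficients, and the seed are assumptions standing in for the CSV from the question):

```r
library(MASS)  # provides glm.nb() and rnegbin()

set.seed(1)
# Simulated stand-in for the real data: 57 rows, one predictor L, count response A
n <- 57
sim <- data.frame(L = runif(n, 0, 15))
sim$A <- rnegbin(n, mu = exp(1.2 + 0.3 * sim$L), theta = 0.8)

# Fit with a higher iteration limit, recording every warning message
fit_warnings <- character(0)
fit <- withCallingHandlers(
  glm.nb(A ~ L, data = sim, control = glm.control(maxit = 500)),
  warning = function(w) {
    fit_warnings <<- c(fit_warnings, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

length(fit_warnings)  # number of warnings raised during the fit
fit$converged         # convergence flag of the final GLM fit
fit$theta             # estimated theta
fit$SE.theta          # approximate standard error of theta
```

Any "iteration limit reached" message left in fit_warnings would suggest that maxit is still too low or that the model is a poor match for the data.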
See the glm.control documentation for more. Note that you can also set a reasonable value for init.theta, which prevents both theta and AIC from ending up at unreasonable values:
m1 <- glm.nb(A ~ L, data = df, control = glm.control(maxit = 500), init.theta = 1.0)
Output (this run names the data frame df rather than d):
Call:
glm.nb(formula = A ~ L, data = df, control = glm.control(maxit = 500),
init.theta = 0.8016681349, link = log)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.3020  -0.9347  -0.3578   0.1435   2.5420
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.25962    0.40094   3.142  0.00168 **
L            0.38823    0.02994  12.967  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(0.8017) family taken to be 1)
Null deviance: 160.693 on 56 degrees of freedom
Residual deviance: 67.976 on 55 degrees of freedom
AIC: 809.41
Number of Fisher Scoring iterations: 1
Theta: 0.802
Std. Err.: 0.140
2 x log-likelihood: -803.405
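One way to gain some confidence that a fit like this does not depend on the particular init.theta is to refit from a few starting values and compare the results. A rough sketch, again on simulated data (sim, the starting values in starts, and the seed are all assumptions, not values from the question):

```r
library(MASS)  # provides glm.nb() and rnegbin()

set.seed(2)
# Simulated stand-in for the real data
n <- 57
sim <- data.frame(L = runif(n, 0, 15))
sim$A <- rnegbin(n, mu = exp(1.2 + 0.3 * sim$L), theta = 0.8)

# Refit the same model from several starting values of theta
starts <- c(0.5, 1, 5)
fits <- vector("list", length(starts))
for (i in seq_along(starts)) {
  fits[[i]] <- suppressWarnings(
    glm.nb(A ~ L, data = sim, init.theta = starts[i],
           control = glm.control(maxit = 500))
  )
}

# Collect the estimates side by side
comparison <- data.frame(
  init.theta = starts,
  theta      = sapply(fits, function(f) f$theta),
  AIC        = sapply(fits, AIC)
)
comparison
```

If theta and AIC agree closely across rows, the starting value is probably not driving the result; large disagreement would be a reason to look harder at the data or the model.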
Upvotes: 1