Reputation: 2179
How can I use the rms
package in R to execute a negative binomial regression? (I originally posted this question on Statistics SE, but it was closed apparently because it is a better fit here.)
With the MASS
package, I use the glm.nb
function, but I am trying to switch to the rms
package because I sometimes get weird errors when bootstrapping with glm.nb
and some other functions. But I cannot figure out how to do a negative binomial regression with the rms
package.
Here is sample code of what I would like to do (copied from the rms::Glm
function documentation):
library(rms)
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
f <- Glm(counts ~ outcome + treatment, family=poisson())
f
anova(f)
summary(f, outcome=c('1','2','3'), treatment=c('1','2','3'))
So, instead of using family=poisson()
, I would like to use something like family=negative.binomial()
, but I cannot figure out how to do this.
In the documentation for family {stats}
, I found this note in the "See also" section:
For binomial coefficients, choose; the binomial and negative binomial distributions, Binomial, and NegBinomial.
But even after clicking the link for ?NegBinomial
, I cannot make any sense of this.
I would appreciate any help on how to use the rms
package in R to execute a negative binomial regression.
Upvotes: 2
Views: 1702
Reputation: 226182
opinion up front You might be better off posting (as a separate question) a reproducible example of the "weird errors" from your bootstrap attempts and seeing whether people have ideas for resolving them. It's fairly common for NB fitting procedures to throw warnings or errors when data are equi- or underdispersed, as the estimates of the dispersion parameter become infinite in this case ...
@coffeinjunky is correct that using family = negative.binomial(theta=VALUE)
will work (where VALUE
is a numeric constant, e.g. theta=1
for the geometric distribution [a special case of the NB]). However: you won't be able (without significantly more work) be able to fit the general NB model, i.e. the model where the dispersion parameter (theta) is estimated as part of the fitting procedure. That's what MASS::glm.nb
does, and AFAICS there is no analogue in the rms
package.
There are a few other packages/functions in addition to MASS::glm.nb
that fit the negative binomial model, including (at least) bbmle
and glmmTMB
— there may be others such as gamlss
.
## Dobson (1990) Page 93: Randomized Controlled Trial :
dd < data.frame(
counts = c(18,17,15,20,10,20,25,13,12)
outcome = gl(3,1,9),
treatment = gl(3,3))
library(MASS)
m1 <- glm.nb(counts ~ outcome + treatment, data = dd)
## "iteration limit reached" warning
library(glmmTMB)
m2 <- glmmTMB(counts ~ outcome + treatment, family = nbinom2, data = dd)
## "false convergence" warning
library(bbmle)
m3 <- mle2(counts ~ dnbinom(mu = exp(logmu), size = exp(logtheta)),
parameters = list(logmu ~outcome + treatment),
data = dd,
start = list(logmu = 0, logtheta = 0)
)
signif(cbind(MASS=coef(m1), glmmTMB=fixef(m2)$cond, bbmle=coef(m3)[1:5]), 5)
MASS glmmTMB bbmle
(Intercept) 3.0445e+00 3.04540000 3.0445e+00
outcome2 -4.5426e-01 -0.45397000 -4.5417e-01
outcome3 -2.9299e-01 -0.29253000 -2.9293e-01
treatment2 -1.1114e-06 0.00032174 8.1631e-06
treatment3 -1.9209e-06 0.00032823 6.5817e-06
These all agree fairly well (at least for the intercept/outcome parameters). This example is fairly difficult for a NB model (5 parameters + dispersion for 9 observations, data are Poisson rather than NB).
Upvotes: 3
Reputation: 11514
Based on this, the following seems to work:
library(rms)
library(MASS)
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
Glm(counts ~ outcome + treatment, family = negative.binomial(theta = 1))
General Linear Model
rms::Glm(formula = counts ~ outcome + treatment, family = negative.binomial(theta = 1))
Model Likelihood
Ratio Test
Obs 9 LR chi2 0.31
Residual d.f.4 d.f. 4
g 0.2383063 Pr(> chi2) 0.9892
Coef S.E. Wald Z Pr(>|Z|)
Intercept 3.0756 0.2121 14.50 <0.0001
outcome=2 -0.4598 0.2333 -1.97 0.0487
outcome=3 -0.2962 0.2327 -1.27 0.2030
treatment=2 -0.0347 0.2333 -0.15 0.8819
treatment=3 -0.0503 0.2333 -0.22 0.8293
Upvotes: 1