Tripartio
Tripartio

Reputation: 2179

How to do negative binomial regression with the rms package in R?

How can I use the rms package in R to execute a negative binomial regression? (I originally posted this question on Statistics SE, but it was closed apparently because it is a better fit here.)

With the MASS package, I use the glm.nb function, but I am trying to switch to the rms package because I sometimes get weird errors when bootstrapping with glm.nb and some other functions. But I cannot figure out how to do a negative binomial regression with the rms package.

Here is sample code of what I would like to do (copied from the rms::Glm function documentation):

library(rms)
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
f <- Glm(counts ~ outcome + treatment, family=poisson())

f
anova(f)
summary(f, outcome=c('1','2','3'), treatment=c('1','2','3'))

So, instead of using family=poisson(), I would like to use something like family=negative.binomial(), but I cannot figure out how to do this.

In the documentation for family {stats}, I found this note in the "See also" section:

For binomial coefficients, choose; the binomial and negative binomial distributions, Binomial, and NegBinomial.

But even after clicking the link for ?NegBinomial, I cannot make any sense of this.

I would appreciate any help on how to use the rms package in R to execute a negative binomial regression.

Upvotes: 2

Views: 1702

Answers (2)

Ben Bolker
Ben Bolker

Reputation: 226182

opinion up front You might be better off posting (as a separate question) a reproducible example of the "weird errors" from your bootstrap attempts and seeing whether people have ideas for resolving them. It's fairly common for NB fitting procedures to throw warnings or errors when data are equi- or underdispersed, as the estimates of the dispersion parameter become infinite in this case ...

@coffeinjunky is correct that using family = negative.binomial(theta=VALUE) will work (where VALUE is a numeric constant, e.g. theta=1 for the geometric distribution [a special case of the NB]). However: you won't be able (without significantly more work) be able to fit the general NB model, i.e. the model where the dispersion parameter (theta) is estimated as part of the fitting procedure. That's what MASS::glm.nb does, and AFAICS there is no analogue in the rms package.

There are a few other packages/functions in addition to MASS::glm.nb that fit the negative binomial model, including (at least) bbmle and glmmTMB — there may be others such as gamlss.

## Dobson (1990) Page 93: Randomized Controlled Trial :
dd < data.frame(
   counts = c(18,17,15,20,10,20,25,13,12)
   outcome = gl(3,1,9),
   treatment = gl(3,3))

MASS::glm.nb

library(MASS)
m1 <- glm.nb(counts ~ outcome + treatment, data = dd)
## "iteration limit reached" warning

glmmTMB

library(glmmTMB)
m2 <- glmmTMB(counts ~ outcome + treatment, family = nbinom2, data = dd)
## "false convergence" warning

bbmle

library(bbmle)
m3 <- mle2(counts ~ dnbinom(mu = exp(logmu), size = exp(logtheta)),
     parameters = list(logmu ~outcome + treatment),
     data = dd,
     start = list(logmu = 0, logtheta = 0)
)
signif(cbind(MASS=coef(m1), glmmTMB=fixef(m2)$cond, bbmle=coef(m3)[1:5]), 5)
                   MASS     glmmTMB       bbmle
(Intercept)  3.0445e+00  3.04540000  3.0445e+00
outcome2    -4.5426e-01 -0.45397000 -4.5417e-01
outcome3    -2.9299e-01 -0.29253000 -2.9293e-01
treatment2  -1.1114e-06  0.00032174  8.1631e-06
treatment3  -1.9209e-06  0.00032823  6.5817e-06

These all agree fairly well (at least for the intercept/outcome parameters). This example is fairly difficult for a NB model (5 parameters + dispersion for 9 observations, data are Poisson rather than NB).

Upvotes: 3

coffeinjunky
coffeinjunky

Reputation: 11514

Based on this, the following seems to work:

library(rms)
library(MASS)

counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)

Glm(counts ~ outcome + treatment, family = negative.binomial(theta = 1))
General Linear Model
 
 rms::Glm(formula = counts ~ outcome + treatment, family = negative.binomial(theta = 1))
 
                    Model Likelihood    
                          Ratio Test    
    Obs       9    LR chi2      0.31    
 Residual d.f.4    d.f.            4    
    g 0.2383063    Pr(> chi2) 0.9892    
 
             Coef    S.E.   Wald Z Pr(>|Z|)
 Intercept    3.0756 0.2121 14.50  <0.0001 
 outcome=2   -0.4598 0.2333 -1.97  0.0487  
 outcome=3   -0.2962 0.2327 -1.27  0.2030  
 treatment=2 -0.0347 0.2333 -0.15  0.8819  
 treatment=3 -0.0503 0.2333 -0.22  0.8293 

Upvotes: 1

Related Questions