danilofreire

Reputation: 503

Logistic regression with robust clustered standard errors in R

A newbie question: does anyone know how to run a logistic regression with clustered standard errors in R? In Stata it's just logit Y X1 X2 X3, vce(cluster Z), but unfortunately I haven't figured out how to do the same analysis in R. Thanks in advance!

Upvotes: 14

Views: 28614

Answers (4)

baruuum

Reputation: 191

Another alternative would be to use the sandwich and lmtest packages as follows. Suppose that z is the column with the cluster indicators in your dataset dat. Then

# load libraries
library("sandwich")
library("lmtest")

# fit the logistic regression
fit <- glm(y ~ x, data = dat, family = binomial)

# get results with clustered standard errors (of type HC0)
coeftest(fit, vcov. = vcovCL(fit, cluster = dat$z, type = "HC0"))

will do the job.
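
To see what a clustered covariance of this kind computes, here is a minimal base-R sketch of an HC0-style cluster sandwich on simulated data. The data, the sizes, and the names G, n_per, u, scores and V_cl are all made up for illustration. It applies no finite-sample correction, so packaged functions such as vcovCL may differ from it by a small-sample factor depending on their settings.

```r
# Simulated data with a cluster-level random effect (illustrative only)
set.seed(1)
G     <- 50                          # number of clusters (assumed)
n_per <- 20                          # observations per cluster (assumed)
z     <- rep(seq_len(G), each = n_per)
u     <- rnorm(G)[z]                 # shared shock within each cluster
x     <- rnorm(G * n_per)
y     <- rbinom(G * n_per, 1, plogis(-0.5 + 0.8 * x + u))
dat   <- data.frame(y = y, x = x, z = z)

fit <- glm(y ~ x, data = dat, family = binomial)

X <- model.matrix(fit)
p <- fitted(fit)

# "Bread": inverse of X'WX with logit weights W = p(1 - p)
bread <- solve(crossprod(X * sqrt(p * (1 - p))))

# Per-observation score contributions, summed within each cluster
scores <- X * (dat$y - p)
S      <- rowsum(scores, dat$z)

# "Meat": sum over clusters of the outer products of the cluster scores
meat <- crossprod(S)

V_cl  <- bread %*% meat %*% bread
se_cl <- sqrt(diag(V_cl))

# Compare model-based and cluster-robust standard errors
cbind(naive = sqrt(diag(vcov(fit))), clustered = se_cl)
```

The only difference from the usual heteroskedasticity-robust sandwich is that the scores are summed within clusters before the outer product is taken.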

Upvotes: 6

Jim Stankovich

Reputation: 51

There is a command glm.cluster in the R package miceadds which seems to give the same results for logistic regression as Stata does with the option vce(cluster). See the documentation here.

In one of the examples on this page, the commands

mod2 <- miceadds::glm.cluster(data=dat, formula=highmath ~ hisei + female,
                              cluster="idschool", family="binomial")
summary(mod2)

give the same robust standard errors as the Stata command

logit highmath hisei female, vce(cluster idschool)

e.g. a standard error of 0.004038 for the variable hisei.

Upvotes: 5

David F

Reputation: 1536

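You might want to look at the rms (Regression Modeling Strategies) package. Its lrm function fits a logistic regression model, and if fit is the name of the fitted model, you'd have something like this:

library(rms)

# fit the model, storing the design matrix (x) and response (y) in the fit
fit <- lrm(disease ~ age + study + rcs(bmi, 3), x = TRUE, y = TRUE, data = dataf)

fit

# cluster-robust (Huber-White sandwich) covariance
robcov(fit, cluster = dataf$id)

# cluster bootstrap covariance
bootcov(fit, cluster = dataf$id)

You have to specify x = TRUE, y = TRUE in the model statement, because robcov and bootcov need the design matrix and response stored in the fit. rcs indicates restricted cubic splines, here with 3 knots.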
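
For intuition about the bootstrap route, here is a rough base-R sketch of a cluster bootstrap: resample whole clusters with replacement, refit, and take the spread of the coefficients. This is not the rms implementation, only an illustration of the idea. The simulated data, the number of replicates B, and the helper names rows_by_id and boot_coefs are assumptions.

```r
# Simulated clustered data (illustrative only)
set.seed(42)
G       <- 40                        # number of clusters (assumed)
n_per   <- 15                        # observations per cluster (assumed)
id      <- rep(seq_len(G), each = n_per)
age     <- rnorm(G * n_per, mean = 50, sd = 10)
u       <- rnorm(G, sd = 0.8)[id]    # cluster-level random effect
disease <- rbinom(G * n_per, 1, plogis(-3 + 0.05 * age + u))
dataf   <- data.frame(disease = disease, age = age, id = id)

# Row indices belonging to each cluster
rows_by_id <- split(seq_len(nrow(dataf)), dataf$id)

B <- 200                             # bootstrap replicates (assumed)
boot_coefs <- t(replicate(B, {
  # draw clusters, not individual rows, with replacement
  drawn <- sample(names(rows_by_id), replace = TRUE)
  idx   <- unlist(rows_by_id[drawn], use.names = FALSE)
  coef(glm(disease ~ age, data = dataf[idx, ], family = binomial))
}))

# Bootstrap standard errors: standard deviation across replicates
boot_se <- apply(boot_coefs, 2, sd)
boot_se
```

Resampling whole clusters keeps the within-cluster dependence intact in every replicate, which is what makes the resulting standard errors cluster-robust.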

Upvotes: 16

MichaelChirico

Reputation: 34703

I had been banging my head against this problem for two days when I stumbled on what appears to be a new package, Zelig, which seems destined for great things. For example, I am also running some cluster-robust Tobit models in my analysis, and this package has that functionality built in as well. Not to mention that the syntax is much cleaner than in all the other solutions I've seen (we're talking near-Stata levels of clean).

So for your toy example, I'd run:

library(Zelig)
logit <- zelig(Y ~ X1 + X2 + X3, data = data, model = "logit", robust = TRUE, cluster = "Z")

Et voilà!

Upvotes: 5
