Reputation: 25
I'm not a Stata user so I'm trying to reproduce Stata results that are given to me in R. I would like to use a GLM with a complementary log-log function. The stata code I have is:
glm c IndA fia, family(binomial s) link(cloglog) offset(offset)
The R code is:
glmt <- glm(data=dataset, c ~ IndA + fia, offset = offset,
family = binomial(link = cloglog))
Which yields different results from the Stata output. I think the difference is in the variable s that is included in the binomial family in Stata (bold in the code). How can I incorporate this variable in the R code?
My dataset looks like:
IndA s c itot fia offset
1 23 0 61 0.442622951 -0.494296322
1 25 0 58 0.431034483 -0.544727175
1 27 0 59 0.389830508 -0.527632742
1 31 3 51 0.37254902 -0.673344553
1 28 2 53 0.41509434 -0.634878272
1 26 0 55 0.436363636 -0.597837001
...
where IndA is a dummy variable that is 0 later in the dataset. c is the difference in s (n - (n+1)).
The R output looks like this:
Call:
glm(formula = c ~ IndA + fia, family = binomial(link = cloglog),
data = dataset, offset = offset)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.2697 -0.9707 -0.8304 1.3688 1.6390
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.9633 1.9185 1.023 0.306
IndA -0.3174 0.3357 -0.945 0.344
fia -5.1155 4.8163 -1.062 0.288
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 136.81 on 101 degrees of freedom
Residual deviance: 134.71 on 99 degrees of freedom
AIC: 140.71
Number of Fisher Scoring iterations: 5
The Stata output is a little messed up but looks like this:
. glm c IndA fia, family(binomial s ) link(cloglog) offset(offset)
Iteration 0: log likelihood = -144.17967
Iteration 1: log likelihood = -133.66053
Iteration 2: log likelihood = -133.58996
Iteration 3: log likelihood = -133.58992
Iteration 4: log likelihood = -133.58992
Generalized linear models No. of obs = 102
Optimization : ML Residual df = 99
Scale parameter = 1
Deviance = 179.1806126 (1/df) Deviance = 1.809905
Pearson = 203.965157 (1/df) Pearson = 2.060254
Variance function: V(u) = u*(1-u/s) [Binomial]
Link function : g(u) = ln(-ln(1-u/s)) [Complementary log-log]
AIC = 2.678234
Log likelihood = -133.5899239 BIC = -278.6917
OIM
c Coef. Std. Err. z P>|z| [95% Conf. Interval]
IndA -.7284992 .2308676 -3.16 0.002 -1.180991 -.2760071
fia -7.147842 3.185532 -2.24 0.025 -13.39137 -.9043128
_cons .4404201 1.265651 0.35 0.728 -2.040211 2.921051
offset (offset)
This output is for the whole dataset:
IndA s c itot fia offset
1 23 0 61 0.442622951 -0.494296322
1 25 0 58 0.431034483 -0.544727175
1 27 0 59 0.389830508 -0.527632742
1 31 3 51 0.37254902 -0.673344553
1 28 2 53 0.41509434 -0.634878272
1 26 0 55 0.436363636 -0.597837001
1 26 0 52 0.461538462 -0.653926467
1 27 0 53 0.433962264 -0.634878272
1 29 1 50 0.42 -0.693147181
1 28 0 52 0.423076923 -0.653926467
1 28 0 56 0.392857143 -0.579818495
1 30 4 50 0.4 -0.693147181
1 26 0 57 0.421052632 -0.562118918
1 26 1 56 0.428571429 -0.579818495
1 25 0 58 0.431034483 -0.544727175
1 26 0 56 0.428571429 -0.579818495
1 29 3 54 0.388888889 -0.616186139
1 26 3 58 0.413793103 -0.544727175
1 23 0 62 0.435483871 -0.478035801
1 23 0 62 0.435483871 -0.478035801
1 25 0 59 0.423728814 -0.527632742
1 27 3 54 0.425925926 -0.616186139
1 24 0 60 0.433333333 -0.510825624
1 25 0 60 0.416666667 -0.510825624
1 25 0 60 0.416666667 -0.510825624
1 26 0 57 0.421052632 -0.562118918
1 27 0 55 0.418181818 -0.597837001
1 27 0 53 0.433962264 -0.634878272
1 27 0 55 0.418181818 -0.597837001
1 29 0 56 0.375 -0.579818495
1 31 0 53 0.358490566 -0.634878272
1 31 0 52 0.365384615 -0.653926467
1 34 0 50 0.32 -0.693147181
1 34 1 51 0.31372549 -0.673344553
1 33 5 55 0.309090909 -0.597837001
1 28 0 60 0.366666667 -0.510825624
1 28 1 58 0.379310345 -0.544727175
1 27 0 58 0.396551724 -0.544727175
1 28 0 58 0.379310345 -0.544727175
1 28 1 58 0.379310345 -0.544727175
1 27 0 59 0.389830508 -0.527632742
1 27 0 59 0.389830508 -0.527632742
1 27 0 57 0.403508772 -0.562118918
1 29 1 53 0.396226415 -0.634878272
1 28 0 55 0.4 -0.597837001
1 30 1 54 0.37037037 -0.616186139
1 29 0 54 0.388888889 -0.616186139
1 31 1 50 0.38 -0.693147181
1 30 0 57 0.350877193 -0.562118918
1 30 4 57 0.350877193 -0.562118918
1 26 0 61 0.393442623 -0.494296322
0 16 0 61 0.442622951 -0.494296322
0 17 3 58 0.431034483 -0.544727175
0 14 0 59 0.389830508 -0.527632742
0 18 0 51 0.37254902 -0.673344553
0 19 0 53 0.41509434 -0.634878272
0 19 0 55 0.436363636 -0.597837001
0 22 2 52 0.461538462 -0.653926467
0 20 0 53 0.433962264 -0.634878272
0 21 1 50 0.42 -0.693147181
0 20 4 52 0.423076923 -0.653926467
0 16 0 56 0.392857143 -0.579818495
0 20 3 50 0.4 -0.693147181
0 17 0 57 0.421052632 -0.562118918
0 18 1 56 0.428571429 -0.579818495
0 17 0 58 0.431034483 -0.544727175
0 18 1 56 0.428571429 -0.579818495
0 17 1 54 0.388888889 -0.616186139
0 16 1 58 0.413793103 -0.544727175
0 15 0 62 0.435483871 -0.478035801
0 15 0 62 0.435483871 -0.478035801
0 16 0 59 0.423728814 -0.527632742
0 19 3 54 0.425925926 -0.616186139
0 16 1 60 0.433333333 -0.510825624
0 15 0 60 0.416666667 -0.510825624
0 15 0 60 0.416666667 -0.510825624
0 17 0 57 0.421052632 -0.562118918
0 18 0 55 0.418181818 -0.597837001
0 20 2 53 0.433962264 -0.634878272
0 18 3 55 0.418181818 -0.597837001
0 15 0 56 0.375 -0.579818495
0 16 0 53 0.358490566 -0.634878272
0 17 1 52 0.365384615 -0.653926467
0 16 1 50 0.32 -0.693147181
0 15 3 51 0.31372549 -0.673344553
0 12 0 55 0.309090909 -0.597837001
0 12 0 60 0.366666667 -0.510825624
0 14 0 58 0.379310345 -0.544727175
0 15 1 58 0.396551724 -0.544727175
0 14 0 58 0.379310345 -0.544727175
0 14 0 58 0.379310345 -0.544727175
0 14 0 59 0.389830508 -0.527632742
0 14 0 59 0.389830508 -0.527632742
0 16 0 57 0.403508772 -0.562118918
0 18 1 53 0.396226415 -0.634878272
0 17 1 55 0.4 -0.597837001
0 16 0 54 0.37037037 -0.616186139
0 17 0 54 0.388888889 -0.616186139
0 19 6 50 0.38 -0.693147181
0 13 0 57 0.350877193 -0.562118918
0 13 0 57 0.350877193 -0.562118918
0 13 1 61 0.393442623 -0.494296322
Hope this helps.
Thanks in advance!
Upvotes: 0
Views: 711
Reputation: 56219
Stata family name
familyname Description ------------------------------------------------------------------------- gaussian Gaussian (normal) igaussian inverse Gaussian binomial[varnameN|#N] Bernoulli/binomial poisson Poisson nbinomial[#k|ml] negative binomial gamma gamma
Seems like s is one of your variables in dataset. It is hard to tell where this variable should go in R glm
without seeing your data structure.
See this link for stata glm examples.
The binomial distribution can be specified as 1) family(binomial), 2) family(binomial #N), or 3) family(binomial varnameN). In case 2, #N is the value of the binomial denominator N, the number of trials. Specifying family(binomial 1) is the same as specifying family(binomial). In case 3, varnameN is the variable containing the binomial denominator, allowing the number of trials to vary across observations.
Upvotes: 3