Reputation: 77
I am interested in estimating a mixed effect model with two random components (I am sorry for the somewhat unprecise notation. I am somewhat new to these kind of models). Finally, I also want also the standard errors of the variances of the random components. That is why I am somewhat boudn to using the package lme
. The reason is that I found this description on how to calculate those standard errors and also interesting, the standard error for function of these variances link.
I believe I know how to use the package lmer
. I am finally interested in model2
. For the model1
, both command yield the same estimates. But model2
with lme
yields different results than model2
with lmer
from the lme4
package. Could you help me to get around how to set up the random components for lme
? This would be much appreciated. Thanks. Please find attached my MWE.
Best
Daniel
#### load all packages #####
loadpackage <- function(x){
for( i in x ){
# require returns TRUE invisibly if it was able to load package
if( ! require( i , character.only = TRUE ) ){
# If package was not able to be loaded then re-install
install.packages( i , dependencies = TRUE )
}
# Load package (after installing)
library( i , character.only = TRUE )
}
}
# Then try/install packages...
loadpackage( c("nlme", "msm", "lmeInfo", "lme4"))
alcohol1 <- read.table("https://stats.idre.ucla.edu/stat/r/examples/alda/data/alcohol1_pp.txt", header=T, sep=",")
attach(alcohol1)
id <- as.factor(id)
age <- as.factor(age)
model1.lmer <-lmer(alcuse ~ 1 + peer + (1|id))
summary(model1.lmer)
model2.lmer <-lmer(alcuse ~ 1 + peer + (1|id) + (1|age))
summary(model2.lmer)
model1.lme <- lme(alcuse ~ 1+ peer, data = alcohol1, random = ~ 1 |id, method ="REML")
summary(model1.lme)
model2.lme <- lme(alcuse ~ 1+ peer, data = alcohol1, random = ~ 1 |id + 1|age, method ="REML")
Edit (15/09/2021):
Estimating the model as follows end then returning the estimates via nlme::VarCorr
gives me different results. While the estimates seem to be in the ball park, it is as they are switched across components.
model2a.lme <- lme(alcuse ~ 1+ peer, data = alcohol1, random = ~ 1 |id/age, method ="REML")
summary(model2a.lme)
nlme::VarCorr(model2a.lme)
Variance StdDev
id = pdLogChol(1)
(Intercept) 0.38390274 0.6195989
age = pdLogChol(1)
(Intercept) 0.47892113 0.6920413
Residual 0.08282585 0.2877948
EDIT (16/09/2021):
Since Bob pushed me to think more about my model, I want to give some additional information. Please know that the data I use in the MWE do not match my true data. I just used it for illustrative purposes since I can not upload myy true data. I have a household panel with income, demographic informations and parent indicators.
I am interested in intergenerational mobility. Sibling correlations of permanent income are one industry standard. At the very least, contemporanous observations are very bad proxies of permanent income. Due to transitory shocks, i.e., classical measurement error, those estimates are most certainly attenuated. For this reason, we exploit the longitudinal dimension of our data.
For sibling correlations, this amounts to hypothesising that the income process is as follows:
$$Y_{ijt} = \beta X_{ijt} + \epsilon_{ijt}.$$
With Y being income from individual i from family j in year t. X comprises age and survey year indicators to account for life-cycle effects and macroeconmic conditions in survey years. Epsilon is a compund term comprising a random individual and family component as well as a transitory component (measurement error and short lived shocks). It looks as follows:
$$\epsilon_{ijt} = \alpha_i + \gamma_j + \eta_{ijt}.$$
The variance of income is then:
$$\sigma^2_\epsilon = \sigma^2_\alpha + \sigma^2\gamma + \sigma^2\eta.$$
The quantitiy we are interested in is
$$\rho = \frac(\sigma^2\gamma}{\sigma^2_\alpha + \sigma^2\gamma},$$
which reflects the share of shared family (and other characteristics) among siblings of the variation in permanent income.
B.t.w.: The struggle is simply because I want to have a standard errors for all estimates and for \rho.
Upvotes: 1
Views: 842
Reputation: 77
Okay, finally. Just to sketch my confidential data: I have a panel of individuals. The data includes siblings, identified via mnr. income is earnings, wavey survey year, age age factors. female a factor for gender, pid is the factor identifying the individual.
m1 <- lmer(income ~ age + wavey + female + (1|pid) + (1 | mnr),
data = panel)
vv <- vcov(m1, full = TRUE)
covvar <- vv[58:60, 58:60]
covvar
3 x 3 Matrix of class "dgeMatrix"
cov_pid.(Intercept) cov_mnr.(Intercept) residual
[1,] 2.6528679 -1.4624588 -0.4077576
[2,] -1.4624588 3.1015001 -0.0597926
[3,] -0.4077576 -0.0597926 1.1634680
mean <- as.data.frame(VarCorr(m1))$vcov
mean
[1] 17.92341 16.86084 56.77185
deltamethod(~ x2/(x1+x2), mean, covvar, ses =TRUE)
[1] 0.04242089
The last scalar should be what I interprete as the shared background of the siblings of permanent income.
Thanks to @Ben Bolker who pointed me into this direction.
Upvotes: 0
Reputation: 226162
This is an example of crossed vs nested random effects. (Note that the example you refer to is fitting a different kind of model, a random-slopes model rather than a model with two different grouping variables ...)
If you try with(alcohol1, table(age,id))
you can see that every id
is associated with every possible age (14, 15, 16). Or subset(alcohol1, id==1)
for example:
id age coa male age_14 alcuse peer cpeer ccoa
1 1 14 1 0 0 1.732051 1.264911 0.2469111 0.549
2 1 15 1 0 1 2.000000 1.264911 0.2469111 0.549
3 1 16 1 0 2 2.000000 1.264911 0.2469111 0.549
There are three possible models you could fit for a model with random effects of age
(indexed by i
) and id
(indexed by j
)
(1|age) + (1|id)
): Y_{ij} = beta0 + beta1*peer + eps1_i + eps2_j +epsr_{ij}
; alcohol use varies among individuals and, independently, across ages (this model won't work very well because there are only three distinct ages in the data set, more levels are usually needed)(1|age/id)
= (1|age) + (1|age:id)
): Y_{ij} = beta0 + beta1*peer + eps1_i + eps2_{ij} + epsr_{ij}
; alcohol use varies across ages, and varies across individuals within ages (see note above about number of levels).(1|id/age)
= (1|id) + (1|age:id)
): Y_{ij} = beta0 + beta1*peer + eps1_j + eps2_{ij} + epsr_{ij}
; alcohol use varies across individuals, and varies across ages within individualsHere eps1_i
, eps2_{ij}
, and epsr_{ij}
are normal deviates; epsr
is the residual error term.
The latter two models actually don't make sense in this case; because there is only one observation per age/id combination, the nested variance (eps2
) is completely confounded with the residual variance (epsr
). lme
doesn't notice this; if you try to fit one of the nested models in lmer
it will give an error that
number of levels of each grouping factor must be < number of observations (problems: id:age)
(Although if you try to compute confidence intervals based on model1.lme
you'll get an error "cannot get confidence intervals on var-cov components: Non-positive definite approximate variance-covariance", which is a hint that something is wrong.)
You could restate this problem as saying that the residual variation, and the variation among ages within individuals, are jointly unidentifiable (can't be separated from each other, statistically).
The updated answer here shows how to get the standard errors of the variance components from an lmer
model, so you shouldn't be stuck with lme
(but you should think carefully about which model you're really trying to fit ...)
The GLMM FAQ might also be useful.
More generally, the standard error of
rho = (V_gamma)/(V_alpha + V_gamma)
will be hard to compute accurately, because this is a nonlinear function of the model parameters. You can apply the delta method, but the most reliable approach would be to use parametric bootstrapping: if you have a fitted model m
, then something like this should work:
var_ratio <- function(m) {
v <- as.data.frame(sapply(VarCorr(m), as.numeric))
return(v$family/(v$family + v$id))
}
confint(m, method="boot", FUN =var_ratio)
Upvotes: 2
Reputation: 15123
You should specify random effects in lme
by using /
not +
lmer
model2.lmer <-lmer(alcuse ~ 1 + peer + (1|id) + (1|age), data = alcohol1)
summary(model2.lmer)
Linear mixed model fit by REML ['lmerMod']
Formula: alcuse ~ 1 + peer + (1 | id) + (1 | age)
Data: alcohol1
REML criterion at convergence: 651.3
Scaled residuals:
Min 1Q Median 3Q Max
-2.0228 -0.5310 -0.1329 0.5854 3.1545
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 0.08078 0.2842
age (Intercept) 0.30313 0.5506
Residual 0.56175 0.7495
Number of obs: 246, groups: id, 82; age, 82
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.3039 0.1438 2.113
peer 0.6074 0.1151 5.276
Correlation of Fixed Effects:
(Intr)
peer -0.814
lme
model2.lme <- lme(alcuse ~ 1+ peer, data = alcohol1, random = ~ 1 |id/age, method ="REML")
summary(model2.lme)
Linear mixed-effects model fit by REML
Data: alcohol1
AIC BIC logLik
661.3109 678.7967 -325.6554
Random effects:
Formula: ~1 | id
(Intercept)
StdDev: 0.4381206
Formula: ~1 | age %in% id
(Intercept) Residual
StdDev: 0.4381203 0.7494988
Fixed effects: alcuse ~ 1 + peer
Value Std.Error DF t-value p-value
(Intercept) 0.3038946 0.1438333 164 2.112825 0.0361
peer 0.6073948 0.1151228 80 5.276060 0.0000
Correlation:
(Intr)
peer -0.814
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.0227793 -0.5309669 -0.1329302 0.5853768 3.1544873
Number of Observations: 246
Number of Groups:
id age %in% id
82 82
Upvotes: 0