Reputation: 11
I am a social scientist currently running a simple moderation model in R, in the form of y ~ x + m + m * x
. My moderator is a binary categorical variable (two separate groups).
I started out with lm(), bootstrapped estimates with boot() and obtained bca confidence intervals with boot.ci. Since there is no automated way of doing this for all parameters (at my coding level at least), this is bit tedious. Howver, I now saw that the lavaan
package offer bootstrapping as part of the regular sem()
function, and also bca CIs as part of parameterEstimates()
. So, I was wondering (since I am using lavaan in other analyses) whether I could just replace lm() with lavaan for the sake of keeping my work more consistent.
Doing this, I was wondering about what the equivalent model for lavaan would be to test for moderation in the same way. I saw this post where Jeremy Miles proposes the code below, which I follow mostly.
mod.1 <- "
y ~ c(a, b) * x
y ~~ c(v1, v1) * y # This step needed for exact equivalence
y ~ c(int1, int2) * 1
modEff := a - b
mEff := int1 - int2"
But it would be great if you could help me figure out some final things.
1) What does the y ~~ c(v1, v1) * y
part mean and and why is it needed for "exact equivalence" to the lm model? From the output it seems this constrics variances of the outcome for both groups to the same value?
2) From the post, am I right to understand that either including the interaction effect as calculated above OR constraining (only) the slope between models and looking at model fit with anova()
would be the same test for moderation?
3) The lavaan page says that adding test = "bootstrap"
to the sem()
function allows for boostrap adjusted p-values. However, I read a lot about p-values conflicting with the bca-CIs at times, and this has happened to me. Searching around, I understand that this conflict comes from the assumptions for the distribution of the data under the H0 for p-values, but not for CIs (which just give the range of most likely values). I was therefore wondering what it exactly means that the p-values given here are "bootstrap-adjusted"? Is it technically more true to report these for my SEM models than the CIs?
Many questions, but I would be very grateful for any help you can provide.
Best,
Alex
Upvotes: 1
Views: 3390
Reputation: 690
I think I can answer at least Nr. 1 and 2 of your questions but it is probably easier to not use SEM and instead program a function that conveniently gives you CIs for all coefficients of your model.
So first, to answer your questions:
x
. The binary moderator is implied by group="m"
when you fit the model with fit.1 <- sem(mod.1, data = df, group = "m")
(took this from the page you linked).c( , )
) in the lavaan code specify named parameters for the first and second group, respectively. By y ~~ c(v1, v1) * y
, the residual variances of y
are set equal in both groups because they have the same name. In contrast, the slopes c(a, b)
and the intercepts c(int1, int2)
are allowed to vary between groups.a == b
and compare the model this to the first version where the slopes can differ. This is the same as comparing lm()
models with and without a:b
(or a*b
) in the formula.boot.ci
to an lm
model fit, this might not be implemented. In the lavaan documentation BCa confidence intervals are only mentioned once: In the section about the parameterEstimates
function, which can also perform bootstrap (see p. 89). However, it does not produce actual BCa (bias-corrected and accelerated) CIs but only bias-corrected ones.As mentioned above, I guess the simplest solution would be to use lm()
and either repeat the boot.ci
procedure for each coefficient or write a wrapper function that does this for you. I suggest this also because a reviewer may be quite puzzled to see you do multigroup SEM instead of a simple moderated regression, which is much more common.
You probably did something like this already:
lm_fit <- function(dat, idx) coef( lm(y ~ x*m, data=dat[idx, ]) )
bs_out <- boot::boot(mydata, statistic=lm_fit, R=1000)
ci_out <- boot::boot.ci(bs_out, conf=.95, type="bca", index=1)
Now, either you repeat the last line for each coefficient, i.e., varying index
from 1 to 4. Or you get fancy and let R do the repeating with a function like this:
all_ci <- function(bs) {
est <- bs$t0
lower <- vector("numeric", length(bs$t0))
upper <- lower
for (i in 1:length(bs$t0)) {
ci <- tail(boot::boot.ci(bs, type="bca", index=i)$bca[1,], 2)
lower[i] <- ci[1]
upper[i] <- ci[2]
}
cbind(est, lower, upper)
}
all_ci(bs_out)
I am sure this could be written more concisely but it should work fine for bootstraps of simple lm()
models.
Upvotes: 2