AleKir

Reputation: 11

Model Syntax for Simple Moderation Model in Lavaan (with bootstrapping)

I am a social scientist currently running a simple moderation model in R, in the form of y ~ x + m + m * x. My moderator is a binary categorical variable (two separate groups).

I started out with lm(), bootstrapped the estimates with boot() and obtained BCa confidence intervals with boot.ci(). Since there is no automated way of doing this for all parameters (at my coding level at least), this is a bit tedious. However, I now saw that the lavaan package offers bootstrapping as part of the regular sem() function, and also BCa CIs as part of parameterEstimates(). So, I was wondering (since I am using lavaan in other analyses) whether I could just replace lm() with lavaan for the sake of keeping my work more consistent.

Doing this, I was wondering what the equivalent lavaan model would be to test for moderation in the same way. I saw this post where Jeremy Miles proposes the code below, which I mostly follow.

mod.1 <- "
 y ~ c(a, b) * x
 y ~~ c(v1, v1) * y  # This step needed for exact equivalence
 y ~ c(int1, int2) * 1
 modEff := a - b
 mEff := int1 - int2"
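
For context, I would then fit this with bootstrapping roughly as follows. This is just my reading of the documentation (the sem() call itself is from the linked post), so please correct me if any of it is off:

library(lavaan)

# df is my data frame; m is the binary moderator used as the grouping variable
fit.1 <- sem(mod.1, data = df, group = "m",
             se = "bootstrap", bootstrap = 1000)

# CIs for all parameters; "bca.simple" looks like the closest option to
# the BCa intervals from boot.ci(), but I am not sure it is identical
parameterEstimates(fit.1, boot.ci.type = "bca.simple", level = 0.95)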

But it would be great if you could help me figure out some final things.

1) What does the y ~~ c(v1, v1) * y part mean, and why is it needed for "exact equivalence" to the lm model? From the output it seems this constrains the variances of the outcome to the same value in both groups?

2) From the post, am I right to understand that either including the interaction effect as calculated above OR constraining (only) the slope to be equal across groups and comparing model fit with anova() would be the same test for moderation?

3) The lavaan page says that adding test = "bootstrap" to the sem() function allows for bootstrap-adjusted p-values. However, I have read a lot about p-values conflicting with BCa CIs at times, and this has happened to me. Searching around, I understand that this conflict comes from the assumptions about the distribution of the data under H0 for p-values, but not for CIs (which just give the range of most likely values). I was therefore wondering what exactly it means that the p-values given here are "bootstrap-adjusted"? Is it technically more appropriate to report these for my SEM models than the CIs?

Many questions, but I would be very grateful for any help you can provide.

Best,

Alex

Upvotes: 1

Views: 3390

Answers (1)

benimwolfspelz

Reputation: 690

I think I can answer at least questions 1 and 2, but it is probably easier not to use SEM and instead write a function that conveniently gives you CIs for all coefficients of your model.

So first, to answer your questions:

  1. What is proposed in the code you gave is called a multigroup comparison. Essentially this means that you fit the same SEM to two different groups of cases in your dataset. It is equivalent to a moderated regression with a binary moderator because in both cases you get two slopes (often called "simple slopes") for the continuous predictor, one slope per group of the moderator.
    Now, in your lavaan code you only see the continuous predictor x. The binary moderator is implied by group = "m" when you fit the model with fit.1 <- sem(mod.1, data = df, group = "m") (took this from the page you linked).
    The two-element vectors (c( , )) in the lavaan code specify named parameters for the first and second group, respectively. With y ~~ c(v1, v1) * y, the residual variances of y are set equal in both groups because they have the same name. In contrast, the slopes c(a, b) and the intercepts c(int1, int2) are allowed to vary between groups.
  2. Yes. If you use the SEM, you would fit the model a second time adding the constraint a == b and compare this model to the first version where the slopes can differ (see the sketch after this list). This is the same as comparing lm() models with and without the x:m interaction term (i.e., y ~ x + m vs. y ~ x * m).
  3. Here I cannot provide a direct answer to your question. I suspect that if you want BCa CIs as you would get from applying boot.ci to an lm fit, this might not be implemented. In the lavaan documentation, BCa confidence intervals are only mentioned once: in the section about the parameterEstimates function, which can also compute bootstrap CIs (see p. 89). However, it does not produce actual BCa (bias-corrected and accelerated) CIs but only bias-corrected ones.
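
Regarding point 2, a minimal sketch of that model comparison could look like the following. I am assuming the same df and grouping variable m as above; mod.0 is just a name I made up for the constrained model:

library(lavaan)

# unconstrained model: the slopes may differ between groups
fit.1 <- sem(mod.1, data = df, group = "m")

# constrained model: force the two slopes to be equal by adding a == b
mod.0 <- paste(mod.1, "a == b", sep = "\n")
fit.0 <- sem(mod.0, data = df, group = "m")

# likelihood-ratio test of the equality constraint = test of moderation
lavTestLRT(fit.0, fit.1)  # anova(fit.0, fit.1) gives the same result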

As mentioned above, I guess the simplest solution would be to use lm() and either repeat the boot.ci procedure for each coefficient or write a wrapper function that does this for you. I suggest this also because a reviewer may be quite puzzled to see you do multigroup SEM instead of a simple moderated regression, which is much more common.

You probably did something like this already:

# mydata is your data frame containing y, x, and the binary moderator m
lm_fit <- function(dat, idx) coef( lm(y ~ x*m, data = dat[idx, ]) )    # statistic: refitted coefficients
bs_out <- boot::boot(mydata, statistic = lm_fit, R = 1000)             # nonparametric bootstrap
ci_out <- boot::boot.ci(bs_out, conf = .95, type = "bca", index = 1)   # BCa CI for the first coefficient

Now you can either repeat the last line for each coefficient, i.e., varying index from 1 to 4 (intercept, x, m, and the x:m interaction), or get fancy and let R do the repeating with a function like this:

all_ci <- function(bs) {
  est <- bs$t0                               # original-sample estimates
  lower <- vector("numeric", length(bs$t0))
  upper <- lower
  for (i in seq_along(bs$t0)) {
    # the $bca matrix has one row per confidence level;
    # its last two entries are the lower and upper limits
    ci <- tail(boot::boot.ci(bs, type = "bca", index = i)$bca[1, ], 2)
    lower[i] <- ci[1]
    upper[i] <- ci[2]
  }
  cbind(est, lower, upper)
}

all_ci(bs_out)

I am sure this could be written more concisely but it should work fine for bootstraps of simple lm() models.
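
If you do want it more compact, something along these lines (same logic, just with vapply instead of the explicit loop; all_ci2 is simply a name I picked) should give the same result:

all_ci2 <- function(bs) {
  get_ci <- function(i) tail(boot::boot.ci(bs, type = "bca", index = i)$bca[1, ], 2)
  cis <- t(vapply(seq_along(bs$t0), get_ci, numeric(2)))   # one row of (lower, upper) per coefficient
  cbind(est = bs$t0, lower = cis[, 1], upper = cis[, 2])
}

all_ci2(bs_out)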

Upvotes: 2
