BurninLeo

Reputation: 4484

Regression from error term to dependent variable (lavaan)

I want to test a structural equation model (SEM). There are 3 indicators, IV1 to IV3, that make up a latent construct LC. This construct should explain a dependent variable DV.

Now, assume that unique variance of the indicators will contribute additional explanation to the DV. Something like this:

IV1 ↖
IV2 ← LC → DV 
IV3 ↙      ↑
 ↑         │
 e3 ───────┘

In lavaan, the error term (residual) of IV3, e3, is usually not written out explicitly:

model <- '
  # latent variable
  LC =~ IV1 + IV2 + IV3
  # regression
  DV ~ LC
'

Further, the residual of IV3 must be split into a component that contributes to explaining DV, and a residual of the residual.

I do not want to explain DV directly by IV3, because my goal is to show how much unique explanation IV3 can contribute to DV. I want to maximize the path IV3 → LC → DV, and then put only the residual into IV3 → DV.

Question:

How do I put this down in a SEM?

Bonus question:

Does it make sense from a SEM perspective that each of the IVs has such a path to DV?

Side note:

What I already did, was to compute this traditionally, using a series of computations. I:

  1. Computed a counterpart to LC, the average of IV1 to IV3
  2. Ran 3 regressions, one of each IVx on that counterpart of LC
  3. Ran a multiple regression of DV on the residuals of the IVx.

Removing the common variance seems to make one of the residuals redundant: the regression model cannot estimate a coefficient for each of the residuals and skips the last one.
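The dropped residual can be reproduced in a small sketch. It assumes simulated data (the variable names follow the question, but the loadings, sample size and seed are made up for illustration) and uses the row mean as the stand-in for LC. Because the three residuals come from regressions on the mean of the same three variables, they add up to zero in every row, so the third one is an exact linear combination of the other two:

```r
# Simulated data; all numeric values here are illustrative assumptions
set.seed(1)
n  <- 200
lc <- rnorm(n)  # simulated common factor
d  <- data.frame(
  IV1 = 0.8 * lc + rnorm(n, sd = 0.6),
  IV2 = 0.7 * lc + rnorm(n, sd = 0.7),
  IV3 = 0.6 * lc + rnorm(n, sd = 0.8)
)
d$DV <- 0.5 * lc + 0.3 * d$IV3 + rnorm(n)

# Step 1: composite (row mean) as a stand-in for the latent construct
d$comp <- rowMeans(d[, c("IV1", "IV2", "IV3")])

# Step 2: residual of each indicator after regressing it on the composite
d$r1 <- resid(lm(IV1 ~ comp, data = d))
d$r2 <- resid(lm(IV2 ~ comp, data = d))
d$r3 <- resid(lm(IV3 ~ comp, data = d))

# The residuals sum to (numerically) zero in every row
max_abs_row_sum <- max(abs(d$r1 + d$r2 + d$r3))
print(max_abs_row_sum)

# Step 3: regress DV on the residuals; lm() cannot estimate all three
fit_resid <- lm(DV ~ r1 + r2 + r3, data = d)
print(coef(fit_resid))
```

The coefficient for r3 comes back as NA because lm() detects the perfect collinearity and drops the last aliased predictor. This is a property of the composite-based procedure, not of the data.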

Upvotes: 2

Views: 1033

Answers (1)

jsakaluk

Reputation: 549

For your question:

How do I put this down in a SEM model? Is it possible at all?

The answer, I think, is yes--at least if I understand you correctly.

If what you want to do is predict an outcome using a latent variable and the unique variance of one of its indicators, this can be easily accomplished in lavaan. See example code below: the first example involves predicting an outcome from a latent variable alone, whereas the second example predicts the same outcome from the same latent variable as well as the unique variance of one of the indicators of that latent variable:

# Load lavaan and use the built-in HolzingerSwineford1939 data set
library(lavaan)
dat <- HolzingerSwineford1939

# Model 1: x4 predicted by the latent variable (visual) only
model1 <- '
visual =~ x1 + x2 + x3
x4 ~ visual
'
# Fit model 1 and get fit measures and R-squared estimates
# (std.lv = TRUE fixes the latent variance to 1 so all loadings are free)
fit1 <- cfa(model1, data = dat, std.lv = TRUE)
summary(fit1, fit.measures = TRUE, rsquare = TRUE)

# Model 2: x4 predicted by the latent variable (visual) and by x3 directly;
# with visual in the equation, the x3 path carries x3's unique variance
model2 <- '
visual =~ x1 + x2 + x3
x4 ~ visual + x3
'
# Fit model 2 and get fit measures and R-squared estimates
fit2 <- cfa(model2, data = dat, std.lv = TRUE)
summary(fit2, fit.measures = TRUE, rsquare = TRUE)

Notice that the R-squared for x4 (the hypothetical outcome) is much larger when predicted by both the latent variable onto which x3 loads, and x3's unique variance.
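If you want the residual to appear as its own term, so that the path from the latent variable keeps the whole effect that runs through the common variance, one possible reparameterization is to give the residual of x3 a name. The sketch below is an assumption on my part rather than something taken from the question: `e3` is a made-up label for a single-indicator latent variable that absorbs all of x3's residual variance and is kept uncorrelated with `visual`. It should be an equivalent model (same fit, same degrees of freedom); only the meaning of the `visual` coefficient changes.

```r
library(lavaan)
dat <- HolzingerSwineford1939

# Model 2 as above, with the factor scaled by fixing its variance to 1
model2 <- '
  visual =~ NA*x1 + x2 + x3
  visual ~~ 1*visual
  x4 ~ visual + x3
'
fit2 <- sem(model2, data = dat)

# Same model with the residual of x3 written as an explicit latent variable
model3 <- '
  visual =~ NA*x1 + x2 + x3
  visual ~~ 1*visual
  e3 =~ 1*x3        # e3 (hypothetical name) stands for the residual of x3
  x3 ~~ 0*x3        # no residual variance is left on x3 itself
  e3 ~~ 0*visual    # the residual is uncorrelated with the common factor
  x4 ~ visual + e3  # common-factor path plus residual path
'
fit3 <- sem(model3, data = dat)

# Compare fit and look at the two regression paths in each model
print(fitMeasures(fit2, c("chisq", "df")))
print(fitMeasures(fit3, c("chisq", "df")))
pe2 <- parameterEstimates(fit2)
pe3 <- parameterEstimates(fit3)
print(pe2[pe2$op == "~", c("lhs", "op", "rhs", "est")])
print(pe3[pe3$op == "~", c("lhs", "op", "rhs", "est")])
```

In model 2 the `visual` coefficient is the effect net of the direct x3 path, whereas in model 3 it also includes the part transmitted through x3's loading. The coefficient on `e3` should match the coefficient on `x3` from model 2.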

As for your second question:

Bonus question: Does that make sense? And even more: Does it make sense from a SEM view (theoretically it does) that each of the independent variables has such a path to DV?

It can make sense, in some cases, to specify such paths, but I would not do so in the absence of strong theory. For example, perhaps you think a variable is a weak but theoretically important indicator of a broader latent variable, as the experience of "awe" is for "positive affect". But perhaps your investigation isn't interested in the latent variable per se: you are interested in the unique effects of awe for predicting something above and beyond its manifestation as a form of positive affect. You might therefore specify a regression pathway from the unique variance of awe to the outcome, in addition to the pathway from positive affect to the outcome.

But could/should you do this for each of your variables? Well, no, you couldn't. As you can see, Model 2 has only one remaining degree of freedom, so it is on the verge of being under-identified (and would be, if you specified the remaining two possible paths from the unique variances of x1 and x2 to the outcome x4).
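The degrees-of-freedom count behind that statement can be checked by hand. This is only a back-of-the-envelope sketch, assuming a model of the covariance structure alone (no mean structure) with the latent variance fixed to 1, as in the fits above:

```r
# Four observed variables: x1, x2, x3, x4
p <- 4
n_moments <- p * (p + 1) / 2  # unique variances and covariances

# Free parameters in Model 2
n_loadings   <- 3  # x1, x2, x3 on visual (latent variance fixed to 1)
n_resid_vars <- 4  # residual variances of x1, x2, x3, x4
n_paths      <- 2  # x4 ~ visual and x4 ~ x3
n_free_model2 <- n_loadings + n_resid_vars + n_paths

df_model2 <- n_moments - n_free_model2

# Adding residual paths from x1 and x2 to x4 costs two more parameters
df_all_paths <- n_moments - (n_free_model2 + 2)

print(c(df_model2 = df_model2, df_all_paths = df_all_paths))
```

A negative count means there are more free parameters than unique variances and covariances, so the model cannot be identified.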

Moreover, I think many would be skeptical of your motivation for attempting to specify all these pathways. Modelling the pathway from the latent variable to the outcome allows you to speak to a broader process; what would you learn by modelling each and every pathway from unique variance to outcome? Sure, you might be able to say, "Well, the remaining 'stuff' in this variable predicts x4!", but what could you say about the nature of that "stuff"? It's just isolated manifest variance. Instead, I think you would be on stronger theoretical ground to consider additional common factors that might underlie the remaining variance of your variables (e.g., method factors); this would add more conceptual specificity to your analyses.

Upvotes: 3
