Reputation: 63
I am trying to extract a single factor from two variables (each measured on a 5-point Likert scale) using confirmatory factor analysis (CFA). I understand that a model with one factor and two indicators has -1 degrees of freedom, and hence is under-identified. However, I have seen models in which a single underlying factor has only two indicators.
I tried to run the CFA in Python using sklearn, but it returned negative loadings for both indicators, which I think is incorrect.
Python code (with data):
import sklearn.decomposition as skd
x = [[2., 4.], [1., 2.], [1., 1.], [2., 2.], [2., 2.], [2., 1.], [1., 1.], [2., 2.], [3., 2.], [2., 2.], [1., 2.], [1., 1.], [2., 2.], [2., 2.], [1., 1.], [3., 3.], [1., 1.], [1., 1.], [1., 1.], [1., 1.], [1., 1.], [0., 0.], [2., 2.], [2., 2.], [2., 2.], [2., 2.], [2., 2.], [1., 1.], [2., 2.], [2., 2.], [2., 2.], [2., 2.], [2., 2.], [2., 2.], [1., 1.], [2., 2.], [2., 2.], [1., 1.], [2., 2.], [2., 1.], [2., 2.], [3., 2.], [1., 1.], [1., 1.], [1., 1.], [2., 2.], [1., 1.], [1., 1.], [1., 1.], [1., 1.], [2., 2.], [1., 1.], [2., 4.], [2., 2.], [1., 1.], [2., 2.], [2., 2.], [3., 2.], [3., 2.], [1., 1.], [1., 1.], [2., 2.], [1., 1.], [1., 1.], [1., 2.], [1., 1.], [1., 1.], [2., 2.], [3., 3.], [1., 1.], [1., 1.], [1., 1.], [1., 1.], [2., 3.], [3., 3.], [2., 2.], [2., 2.], [1., 1.], [1., 1.], [1., 1.], [1., 1.], [1., 1.], [2., 2.], [1., 1.], [2., 2.], [1., 1.], [3., 3.], [2., 2.], [2., 2.], [2., 2.], [2., 2.], [1., 1.], [1., 1.], [2., 2.], [1., 1.], [1., 1.], [1., 1.], [1., 1.], [2., 2.], [2., 2.], [2., 2.], [1., 1.], [1., 2.], [2., 2.], [2., 2.], [2., 2.], [2., 2.], [2., 1.], [2., 2.], [1., 1.], [1., 1.], [2., 2.], [1., 1.], [1., 1.], [1., 1.], [1., 1.], [2., 2.], [2., 2.], [1., 2.], [1., 1.], [1., 1.], [2., 2.]]
skd.FactorAnalysis(n_components=1).fit(x).components_[0]
Output:
array([-0.55779804, -0.58890195])
I also tried to run the CFA in R using the 'lavaan' library, which returns the following warning:
Warning message in lav_model_vcov(lavmodel = lavmodel, lavsamplestats = lavsamplestats, : "lavaan WARNING: Could not compute standard errors! The information matrix could not be inverted. This may be a symptom that the model is not identified."
I am new to CFA and structural equation modeling (SEM), and would really appreciate it if anyone could explain my error (or should I say blunder!).
Upvotes: 1
Views: 611
Reputation: 143
Late answer, I know....
As you say, the real problem is that you don't have enough degrees of freedom: you need at least three indicators to estimate a latent variable in isolation. Yes, there are models with two indicators for a given latent variable, but they are identified only if that latent is correlated with one or more other latents in the same model.
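To make the counting concrete, here is a minimal sketch of the degrees-of-freedom arithmetic for a one-factor model. The helper name `cfa_degrees_of_freedom` is made up for illustration, and it assumes uncorrelated residuals with the factor variance fixed at 1, so the free parameters are one loading and one residual variance per indicator.

```python
def cfa_degrees_of_freedom(n_indicators: int) -> int:
    """Degrees of freedom for a one-factor CFA (hypothetical helper).

    Assumes uncorrelated residuals and a factor variance fixed at 1.
    """
    # Unique variances and covariances in the observed covariance matrix.
    n_observed_moments = n_indicators * (n_indicators + 1) // 2
    # One loading plus one residual variance per indicator.
    n_free_parameters = 2 * n_indicators
    return n_observed_moments - n_free_parameters


for p in (2, 3, 4):
    print(p, "indicators ->", cfa_degrees_of_freedom(p), "df")
```

With two indicators the count is 3 - 4 = -1 (under-identified), with three it is 6 - 6 = 0 (just-identified), and only from four indicators onward is there anything left over to test model fit.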
In an identified CFA, you won't normally run into the problem that all loadings are negative, because the scale of the latent is set by fixing one loading (arbitrarily) at 1. Mind you, if one or more loadings are negative, you can switch the direction of the latent's scale by switching which indicator has the fixed loading of 1--but that doesn't change the underlying math of the model. (Of course, if you fix one loading at -1--or any other negative number--you can certainly get a case where all loadings are negative, but there would rarely be a good reason to do that, and in any case, again, the underlying math would be the same.)
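A small sketch of why the direction of the latent doesn't change the math: the model-implied covariance matrix depends on the loadings only through their products, so flipping the sign of every loading leaves it untouched. The function `implied_covariance` is a hypothetical helper, the loadings are the ones from the question's output, and the residual variances are made-up values used only for illustration.

```python
def implied_covariance(loadings, residual_variances, factor_variance=1.0):
    """Model-implied covariance of a one-factor model (hypothetical helper).

    Entry (i, j) is factor_variance * loading_i * loading_j, with the
    residual variance of indicator i added on the diagonal.
    """
    n = len(loadings)
    return [
        [
            factor_variance * loadings[i] * loadings[j]
            + (residual_variances[i] if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


loadings = [-0.55779804, -0.58890195]  # loadings reported in the question
residuals = [0.1, 0.1]                 # illustrative values, not estimated

sigma_negative = implied_covariance(loadings, residuals)
sigma_positive = implied_covariance([-l for l in loadings], residuals)

# Both sign conventions describe exactly the same covariance structure.
print(sigma_negative == sigma_positive)
```

So two negative loadings from an unconstrained fit are not a wrong answer in themselves; they describe the same factor measured in the opposite direction.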
If you're doing CFA, though, I don't think you want to be using that particular scikit-learn class (FactorAnalysis), which is intended for exploratory factor analysis (EFA). I guess it might work with only a single latent factor, though (and sklearn doesn't have a CFA class).
Upvotes: 1