Noah

Reputation: 21

Poor Recovery of Factor Loadings and Covariances in R Using Monte Carlo in Tidyverse

I am using secondary data to build a measurement model in R and, for proof of concept, I am trying to simulate data from this measurement model and fit it to the original model to show that it will produce similar parameter estimates.

The secondary data I am using to build the measurement model is provided in Table 3 of the following article: Kwon, J. Y., & Sawatzky, R. (2017). Examining gender-related differential item functioning of the Veterans Rand 12-item Health Survey. Quality of Life Research, 26, 2877-2883. https://doi.org/10.1007/s11136-017-1638-x

When I use the Kwon and Sawatzky (2017) model to simulate data and then fit the original measurement model to the simulated data, recovery of the thresholds is near perfect. However, the factor loadings are a bit off and the covariances are far off.

I've been doing trial and error, and these are the best parameter estimates I can currently get. I performed this process seamlessly using secondary data from another measure (the WHOQOL-BREF), where the recovered loadings, intercepts, and covariances were very similar to the originals. So I must be doing something wrong here, but I'm unsure what exactly. Below, I've provided the script I used and the output I get, respectively.

Script

library(lavaan)

#populated with values from Kwon & Sawatzky (2017)
sf12_script<- '

Phys =~ 1*gen + 1.76*mod_active + 1.46*stairs + 1.94*phys_accomp + 2.24*work +
          0*emo_accomp + 0*careful + 1.66*pain + .69*social + 0*calm +
          .93*energy + 0*blue
Ment =~ .14*gen + 0*mod_active + 0*stairs + 0*phys_accomp + 0*work +
          1*emo_accomp + .82*careful + 0*pain + .46*social + .53*calm + 
          .24*energy + .59*blue

#item thresholds
gen | -2.27*t1 + -0.77*t2 + 0.67*t3 + 2.21*t4
mod_active | -1.42*t1 + 0.33*t2
stairs | -0.91*t1 + 0.67*t2
phys_accomp | -3.04*t1 + -1.62*t2 + -0.41*t3 + 0.60*t4
work | -3.26*t1 + -1.74*t2 + -0.47*t3 + 0.66*t4
emo_accomp | -4.58*t1 + -3.18*t2 + -1.92*t3 + -0.98*t4
careful | -4.14*t1 + -3.00*t2 + -2.00*t3 + -1.23*t4
pain | -2.74*t1 + -1.34*t2 + -0.48*t3 + 0.83*t4
social | -3.05*t1 + -1.98*t2 + -0.87*t3 + -0.109*t4
calm | -2.99*t1 + -1.91*t2 + -0.91*t3 + -0.35*t4 + 1.53*t5
energy | -2.47*t1 + -1.29*t2 + -0.20*t3 + 0.43*t4 + 2.37*t5
blue | -3.33*t1 + -2.42*t2 + -1.90*t3 + -0.78*t4 + 0.35*t5

#between item correlations
mod_active ~~ 0.46*stairs
phys_accomp ~~ 0.41*work
emo_accomp ~~ 0.41*careful
calm ~~ 0.09*blue
'

#simulate data from the parameter estimates in Kwon & Sawatzky (2017)
sim_samp_sf12 <- simulateData(sf12_script, sample.nobs = 277518, parameterization = 'theta')

#the script for the two-factor model 
sf12_empty_script <- '

Phys =~ 1*gen + mod_active + stairs + phys_accomp + work + pain + social + energy
Ment =~ 1*emo_accomp + gen + careful + social + calm + energy + blue

#populating item thresholds
gen | t1 + t2 + t3 + t4
mod_active | t1 + t2
stairs | t1 + t2
phys_accomp | t1 + t2 + t3 + t4
work | t1 + t2 + t3 + t4
emo_accomp | t1 + t2 + t3 + t4
careful | t1 + t2 + t3 + t4
pain | t1 + t2 + t3 + t4
social | t1 + t2 + t3 + t4
calm | t1 + t2 + t3 + t4 + t5
energy | t1 + t2 + t3 + t4 + t5
blue | t1 + t2 + t3 + t4 + t5

#between item correlations
mod_active ~~ stairs
phys_accomp ~~ work
emo_accomp ~~ careful
calm ~~ blue
'

#fit the model to the data generated from their sample values
sf12_fit_simsamp <- cfa(sf12_empty_script, data = sim_samp_sf12, std.lv = FALSE)

#check fit
summary(sf12_fit_simsamp, standardized = TRUE, fit.measures = TRUE, rsquare = TRUE)
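
A minimal sketch of how the generating values could be lined up against the fitted loadings. The helper name compare_loadings is hypothetical; it only assumes a data frame with the lhs, op, rhs, and est columns that lavaan's parameterEstimates() returns. The stand-in estimates are the Ment loadings from the first output reported below.

```r
# Hypothetical helper: compare fitted loadings for one factor with the
# values used to generate the data.
# `pe` needs columns lhs, op, rhs, est (the shape of parameterEstimates()).
# `truth` is a named numeric vector of generating loadings.
compare_loadings <- function(pe, factor_name, truth) {
  rows <- pe[pe$op == "=~" & pe$lhs == factor_name, c("rhs", "est")]
  rows <- rows[rows$rhs %in% names(truth), ]
  items <- as.character(rows$rhs)
  data.frame(
    item = items,
    generating = unname(truth[items]),
    estimate = rows$est,
    diff = rows$est - unname(truth[items]),
    row.names = NULL
  )
}

# Generating Ment loadings, copied from the simulation script
ment_truth <- c(gen = 0.14, careful = 0.82, social = 0.46,
                calm = 0.53, energy = 0.24, blue = 0.59)

# Stand-in for parameterEstimates(sf12_fit_simsamp): free Ment loadings
# as reported in the first (delta) output
pe_demo <- data.frame(
  lhs = "Ment",
  op = "=~",
  rhs = c("gen", "careful", "social", "calm", "energy", "blue"),
  est = c(0.138, 0.882, 0.503, 0.664, 0.246, 0.722),
  stringsAsFactors = FALSE
)

res <- compare_loadings(pe_demo, "Ment", ment_truth)
print(res)
```

With a real fit, the stand-in data frame would be replaced by parameterEstimates(sf12_fit_simsamp).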

Output

It may take a couple tries to get the model to converge, but when it does, this is the output:

lavaan 0.6-19 ended normally after 42 iterations

  Estimator                                       DWLS
  Optimization method                           NLMINB
  Number of model parameters                        67

  Number of observations                        277518

Model Test User Model:
                                              Standard      Scaled
  Test Statistic                                35.633      55.199
  Degrees of freedom                                46          46
  P-value (Chi-square)                           0.865       0.166
  Scaling correction factor                                  0.919
  Shift parameter                                           16.428
    simple second-order correction                                

Model Test Baseline Model:

  Test statistic                           6784656.349 3868007.538
  Degrees of freedom                                66          66
  P-value                                        0.000       0.000
  Scaling correction factor                                  1.754

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    1.000       1.000
  Tucker-Lewis Index (TLI)                       1.000       1.000
                                                                  
  Robust Comparative Fit Index (CFI)                         1.000
  Robust Tucker-Lewis Index (TLI)                            1.000

Root Mean Square Error of Approximation:

  RMSEA                                          0.000       0.001
  90 Percent confidence interval - lower         0.000       0.000
  90 Percent confidence interval - upper         0.001       0.002
  P-value H_0: RMSEA <= 0.050                    1.000       1.000
  P-value H_0: RMSEA >= 0.080                    0.000       0.000
                                                                  
  Robust RMSEA                                               0.001
  90 Percent confidence interval - lower                     0.000
  90 Percent confidence interval - upper                     0.002
  P-value H_0: Robust RMSEA <= 0.050                         1.000
  P-value H_0: Robust RMSEA >= 0.080                         0.000

Standardized Root Mean Square Residual:

  SRMR                                           0.002       0.002

Parameter Estimates:

  Parameterization                               Delta
  Standard errors                           Robust.sem
  Information                                 Expected
  Information saturated (h1) model        Unstructured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Phys =~
    gen               1.000                               0.704    0.704
    mod_active        1.005    0.003  398.066    0.000    0.708    0.708
    stairs            1.172    0.002  484.566    0.000    0.825    0.825
    phys_accomp       1.264    0.002  506.441    0.000    0.890    0.890
    work              1.298    0.003  512.201    0.000    0.914    0.914
    pain              1.217    0.002  514.998    0.000    0.856    0.856
    social            0.758    0.003  274.971    0.000    0.533    0.533
    energy            0.951    0.002  407.182    0.000    0.670    0.670
  Ment =~
    emo_accomp        1.000                               0.703    0.703
    gen               0.138    0.004   38.977    0.000    0.097    0.097
    careful           0.882    0.007  135.161    0.000    0.620    0.620
    social            0.503    0.006   91.128    0.000    0.353    0.353
    calm              0.664    0.010   67.562    0.000    0.467    0.467
    energy            0.246    0.004   64.542    0.000    0.173    0.173
    blue              0.722    0.010   69.175    0.000    0.508    0.508

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .mod_active ~~
   .stairs            0.185    0.002  122.614    0.000    0.185    0.463
 .phys_accomp ~~
   .work              0.076    0.001   69.446    0.000    0.076    0.409
 .emo_accomp ~~
   .careful           0.235    0.007   35.020    0.000    0.235    0.422
 .calm ~~
   .blue              0.068    0.004   17.882    0.000    0.068    0.090
  Phys ~~
    Ment              0.001    0.002    0.385    0.700    0.001    0.001

Thresholds:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    gen|t1           -2.273    0.007 -338.199    0.000   -2.273   -2.273
    gen|t2           -0.769    0.003 -289.931    0.000   -0.769   -0.769
    gen|t3            0.671    0.003  259.767    0.000    0.671    0.671
    gen|t4            2.213    0.006  348.863    0.000    2.213    2.213
    mod_active|t1    -1.418    0.003 -406.378    0.000   -1.418   -1.418
    mod_active|t2     0.328    0.002  135.367    0.000    0.328    0.328
    stairs|t1        -0.911    0.003 -328.370    0.000   -0.911   -0.911
    stairs|t2         0.668    0.003  258.736    0.000    0.668    0.668
    phys_accomp|t1   -3.021    0.016 -186.734    0.000   -3.021   -3.021
    phys_accomp|t2   -1.623    0.004 -410.502    0.000   -1.623   -1.623
    phys_accomp|t3   -0.413    0.002 -168.227    0.000   -0.413   -0.413
    phys_accomp|t4    0.598    0.003  235.254    0.000    0.598    0.598
    work|t1          -3.254    0.023 -144.007    0.000   -3.254   -3.254
    work|t2          -1.738    0.004 -406.277    0.000   -1.738   -1.738
    work|t3          -0.471    0.002 -190.042    0.000   -0.471   -0.471
    work|t4           0.659    0.003  255.619    0.000    0.659    0.659
    emo_accomp|t1    -4.337    0.155  -27.900    0.000   -4.337   -4.337
    emo_accomp|t2    -3.213    0.021 -151.006    0.000   -3.213   -3.213
    emo_accomp|t3    -1.918    0.005 -391.297    0.000   -1.918   -1.918
    emo_accomp|t4    -0.978    0.003 -344.042    0.000   -0.978   -0.978
    careful|t1       -4.183    0.114  -36.806    0.000   -4.183   -4.183
    careful|t2       -3.011    0.016 -188.561    0.000   -3.011   -3.011
    careful|t3       -2.007    0.005 -380.594    0.000   -2.007   -2.007
    careful|t4       -1.236    0.003 -389.542    0.000   -1.236   -1.236
    pain|t1          -2.752    0.011 -241.369    0.000   -2.752   -2.752
    pain|t2          -1.344    0.003 -401.043    0.000   -1.344   -1.344
    pain|t3          -0.481    0.002 -193.789    0.000   -0.481   -0.481
    pain|t4           0.826    0.003  306.116    0.000    0.826    0.826
    social|t1        -3.031    0.016 -184.689    0.000   -3.031   -3.031
    social|t2        -1.974    0.005 -384.755    0.000   -1.974   -1.974
    social|t3        -0.869    0.003 -317.556    0.000   -0.869   -0.869
    social|t4        -0.110    0.002  -46.123    0.000   -0.110   -0.110
    calm|t1          -2.978    0.015 -195.155    0.000   -2.978   -2.978
    calm|t2          -1.906    0.005 -392.568    0.000   -1.906   -1.906
    calm|t3          -0.906    0.003 -326.989    0.000   -0.906   -0.906
    calm|t4          -0.346    0.002 -142.478    0.000   -0.346   -0.346
    calm|t5           1.536    0.004  410.634    0.000    1.536    1.536
    energy|t1        -2.467    0.008 -300.607    0.000   -2.467   -2.467
    energy|t2        -1.289    0.003 -395.730    0.000   -1.289   -1.289
    energy|t3        -0.202    0.002  -84.319    0.000   -0.202   -0.202
    energy|t4         0.427    0.002  173.582    0.000    0.427    0.427
    energy|t5         2.371    0.007  319.706    0.000    2.371    2.371
    blue|t1          -3.358    0.026 -126.846    0.000   -3.358   -3.358
    blue|t2          -2.413    0.008 -311.422    0.000   -2.413   -2.413
    blue|t3          -1.900    0.005 -393.287    0.000   -1.900   -1.900
    blue|t4          -0.779    0.003 -292.660    0.000   -0.779   -0.779
    blue|t5           0.351    0.002  144.389    0.000    0.351    0.351

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .gen               0.495                               0.495    0.495
   .mod_active        0.499                               0.499    0.499
   .stairs            0.319                               0.319    0.319
   .phys_accomp       0.209                               0.209    0.209
   .work              0.165                               0.165    0.165
   .pain              0.267                               0.267    0.267
   .social            0.590                               0.590    0.590
   .energy            0.521                               0.521    0.521
   .emo_accomp        0.506                               0.506    0.506
   .careful           0.616                               0.616    0.616
   .calm              0.782                               0.782    0.782
   .blue              0.742                               0.742    0.742
    Phys              0.496    0.002  274.004    0.000    1.000    1.000
    Ment              0.494    0.008   64.706    0.000    1.000    1.000

R-Square:
                   Estimate
    gen               0.505
    mod_active        0.501
    stairs            0.681
    phys_accomp       0.791
    work              0.835
    pain              0.733
    social            0.410
    energy            0.479
    emo_accomp        0.494
    careful           0.384
    calm              0.218
    blue              0.258

My best guesses:

  1. Something is going wrong because I'm using parameter estimates from a strict invariance model across sex and I need to add additional lavaan syntax to get the model to be correctly specified. What syntax, though, I'm unsure about.

  2. I'm wondering whether including the factor mean and covariance parameters would improve recovery. I wasn't sure which values to use, though, as the article provides two sets (one for males, one for females). Is this a scenario where I can choose one and, as long as I'm consistent with the group I choose, it should be fine? I've attempted that (using the values for females only and for males only), and it didn't seem to improve recovery either.

  3. Something in the Methods/Data Analysis/Results section of the Kwon & Sawatzky (2017) article has gone completely over my head and I've specified the model wrong.

  4. The convergence issues are not due to bad draws in random sampling and instead reflect a deeper issue (unknown to me).
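
Regarding guess 4, one way to separate bad draws from a structural problem is to fix the seed and repeat the simulate-and-fit cycle. This is only a sketch on a toy one-factor ordinal model with made-up values (not from the article). The helper converged_by_seed is hypothetical, while simulateData(), cfa(), and lavInspect() are lavaan functions.

```r
library(lavaan)

# Toy generating model: hypothetical loadings and thresholds for illustration
toy_gen <- '
  f =~ 0.8*y1 + 0.7*y2 + 0.6*y3 + 0.7*y4
  y1 | -0.5*t1 + 0.5*t2
  y2 | -0.5*t1 + 0.5*t2
  y3 | -0.5*t1 + 0.5*t2
  y4 | -0.5*t1 + 0.5*t2
'

# Model to fit back to the simulated data (all parameters free)
toy_fit <- '
  f =~ y1 + y2 + y3 + y4
'

# Hypothetical helper: one simulate-and-fit cycle per seed,
# returning TRUE where lavaan reports convergence.
converged_by_seed <- function(gen_model, fit_model, seeds, n) {
  vapply(seeds, function(s) {
    dat <- simulateData(gen_model, sample.nobs = n, seed = s)
    fit <- tryCatch(
      suppressWarnings(
        cfa(fit_model, data = dat, ordered = TRUE,
            parameterization = "theta")
      ),
      error = function(e) NULL
    )
    !is.null(fit) && isTRUE(lavInspect(fit, "converged"))
  }, logical(1))
}

res <- converged_by_seed(toy_gen, toy_fit, seeds = 1:3, n = 2000)
print(res)
```

The same helper could be pointed at the real models, for example converged_by_seed(sf12_script, sf12_empty_script, seeds = 1:20, n = 277518), to see whether failures cluster on particular seeds or occur throughout.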

UPDATED OUTPUT with parameterization = 'theta' in cfa()

lavaan 0.6-19 ended normally after 158 iterations

  Estimator                                       DWLS
  Optimization method                           NLMINB
  Number of model parameters                        67

  Number of observations                        277518

Model Test User Model:
                                              Standard      Scaled
  Test Statistic                                21.883      40.243
  Degrees of freedom                                46          46
  P-value (Chi-square)                           0.999       0.711
  Scaling correction factor                                  0.909
  Shift parameter                                           16.164
    simple second-order correction                                

Model Test Baseline Model:

  Test statistic                           6652355.468 3798046.688
  Degrees of freedom                                66          66
  P-value                                        0.000       0.000
  Scaling correction factor                                  1.752

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    1.000       1.000
  Tucker-Lewis Index (TLI)                       1.000       1.000
                                                                  
  Robust Comparative Fit Index (CFI)                         1.000
  Robust Tucker-Lewis Index (TLI)                            1.000

Root Mean Square Error of Approximation:

  RMSEA                                          0.000       0.000
  90 Percent confidence interval - lower         0.000       0.000
  90 Percent confidence interval - upper         0.000       0.001
  P-value H_0: RMSEA <= 0.050                    1.000       1.000
  P-value H_0: RMSEA >= 0.080                    0.000       0.000
                                                                  
  Robust RMSEA                                               0.001
  90 Percent confidence interval - lower                     0.000
  90 Percent confidence interval - upper                     0.003
  P-value H_0: Robust RMSEA <= 0.050                         1.000
  P-value H_0: Robust RMSEA >= 0.080                         0.000

Standardized Root Mean Square Residual:

  SRMR                                           0.001       0.001

Parameter Estimates:

  Parameterization                               Theta
  Standard errors                           Robust.sem
  Information                                 Expected
  Information saturated (h1) model        Unstructured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Phys =~                                                               
    gen               1.000                               0.992    0.701
    mod_active        1.001    0.005  199.947    0.000    0.993    0.705
    stairs            1.463    0.007  204.028    0.000    1.452    0.824
    phys_accomp       1.970    0.011  184.910    0.000    1.955    0.890
    work              2.263    0.013  170.206    0.000    2.245    0.914
    pain              1.662    0.008  211.687    0.000    1.649    0.855
    social            0.688    0.004  176.206    0.000    0.683    0.527
    energy            0.928    0.004  213.266    0.000    0.921    0.667
  Ment =~                                                               
    emo_accomp        1.000                               1.003    0.708
    gen               0.136    0.004   34.241    0.000    0.136    0.096
    careful           0.832    0.011   77.908    0.000    0.835    0.641
    social            0.456    0.008   55.626    0.000    0.458    0.354
    calm              0.526    0.012   42.346    0.000    0.528    0.467
    energy            0.239    0.005   48.297    0.000    0.240    0.174
    blue              0.588    0.014   42.056    0.000    0.590    0.508

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .mod_active ~~                                                         
   .stairs            0.458    0.003  154.632    0.000    0.458    0.458
 .phys_accomp ~~                                                        
   .work              0.407    0.004   99.675    0.000    0.407    0.407
 .emo_accomp ~~                                                         
   .careful           0.406    0.008   50.231    0.000    0.406    0.406
 .calm ~~                                                               
   .blue              0.087    0.005   18.869    0.000    0.087    0.087
  Phys ~~                                                               
    Ment              0.004    0.003    1.177    0.239    0.004    0.004

Thresholds:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    gen|t1           -3.215    0.010 -314.839    0.000   -3.215   -2.271
    gen|t2           -1.096    0.004 -275.397    0.000   -1.096   -0.774
    gen|t3            0.942    0.004  247.455    0.000    0.942    0.666
    gen|t4            3.135    0.010  317.890    0.000    3.135    2.215
    mod_active|t1    -2.006    0.006 -352.683    0.000   -2.006   -1.424
    mod_active|t2     0.469    0.003  134.990    0.000    0.469    0.332
    stairs|t1        -1.606    0.006 -273.788    0.000   -1.606   -0.911
    stairs|t2         1.189    0.005  231.426    0.000    1.189    0.674
    phys_accomp|t1   -6.745    0.043 -156.841    0.000   -6.745   -3.072
    phys_accomp|t2   -3.560    0.014 -263.441    0.000   -3.560   -1.622
    phys_accomp|t3   -0.892    0.006 -149.409    0.000   -0.892   -0.406
    phys_accomp|t4    1.332    0.007  199.819    0.000    1.332    0.607
    work|t1          -8.077    0.065 -123.651    0.000   -8.077   -3.286
    work|t2          -4.292    0.019 -228.893    0.000   -4.292   -1.746
    work|t3          -1.150    0.007 -156.365    0.000   -1.150   -0.468
    work|t4           1.633    0.009  191.342    0.000    1.633    0.664
    emo_accomp|t1    -6.357    0.305  -20.810    0.000   -6.357   -4.487
    emo_accomp|t2    -4.520    0.044 -102.336    0.000   -4.520   -3.191
    emo_accomp|t3    -2.727    0.021 -127.643    0.000   -2.727   -1.925
    emo_accomp|t4    -1.394    0.011 -123.288    0.000   -1.394   -0.984
    careful|t1       -5.240    0.113  -46.490    0.000   -5.240   -4.022
    careful|t2       -3.941    0.031 -126.126    0.000   -3.941   -3.025
    careful|t3       -2.619    0.017 -154.958    0.000   -2.619   -2.011
    careful|t4       -1.611    0.011 -151.843    0.000   -1.611   -1.236
    pain|t1          -5.320    0.025 -214.707    0.000   -5.320   -2.759
    pain|t2          -2.591    0.008 -316.067    0.000   -2.591   -1.343
    pain|t3          -0.925    0.005 -177.655    0.000   -0.925   -0.480
    pain|t4           1.613    0.006  262.052    0.000    1.613    0.837
    social|t1        -3.972    0.023 -172.736    0.000   -3.972   -3.066
    social|t2        -2.570    0.008 -337.714    0.000   -2.570   -1.984
    social|t3        -1.128    0.004 -288.449    0.000   -1.128   -0.871
    social|t4        -0.136    0.003  -43.784    0.000   -0.136   -0.105
    calm|t1          -3.384    0.019 -176.651    0.000   -3.384   -2.992
    calm|t2          -2.164    0.007 -292.887    0.000   -2.164   -1.913
    calm|t3          -1.033    0.004 -261.680    0.000   -1.033   -0.913
    calm|t4          -0.398    0.003 -136.373    0.000   -0.398   -0.352
    calm|t5           1.724    0.006  295.304    0.000    1.724    1.524
    energy|t1        -3.421    0.012 -284.909    0.000   -3.421   -2.477
    energy|t2        -1.790    0.005 -366.769    0.000   -1.790   -1.296
    energy|t3        -0.279    0.003  -83.858    0.000   -0.279   -0.202
    energy|t4         0.593    0.003  173.447    0.000    0.593    0.430
    energy|t5         3.275    0.011  295.007    0.000    3.275    2.372
    blue|t1          -3.858    0.031 -125.147    0.000   -3.858   -3.322
    blue|t2          -2.811    0.012 -239.198    0.000   -2.811   -2.421
    blue|t3          -2.205    0.008 -270.873    0.000   -2.205   -1.899
    blue|t4          -0.911    0.004 -229.499    0.000   -0.911   -0.784
    blue|t5           0.411    0.003  136.073    0.000    0.411    0.354

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .gen               1.000                               1.000    0.499
   .mod_active        1.000                               1.000    0.503
   .stairs            1.000                               1.000    0.322
   .phys_accomp       1.000                               1.000    0.207
   .work              1.000                               1.000    0.166
   .pain              1.000                               1.000    0.269
   .social            1.000                               1.000    0.596
   .energy            1.000                               1.000    0.524
   .emo_accomp        1.000                               1.000    0.498
   .careful           1.000                               1.000    0.589
   .calm              1.000                               1.000    0.782
   .blue              1.000                               1.000    0.742
    Phys              0.984    0.007  139.084    0.000    1.000    1.000
    Ment              1.007    0.031   32.660    0.000    1.000    1.000

Scales y*:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    gen               0.706                               0.706    1.000
    mod_active        0.710                               0.710    1.000
    stairs            0.567                               0.567    1.000
    phys_accomp       0.455                               0.455    1.000
    work              0.407                               0.407    1.000
    pain              0.519                               0.519    1.000
    social            0.772                               0.772    1.000
    energy            0.724                               0.724    1.000
    emo_accomp        0.706                               0.706    1.000
    careful           0.768                               0.768    1.000
    calm              0.884                               0.884    1.000
    blue              0.861                               0.861    1.000

R-Square:
                   Estimate
    gen               0.501
    mod_active        0.497
    stairs            0.678
    phys_accomp       0.793
    work              0.834
    pain              0.731
    social            0.404
    energy            0.476
    emo_accomp        0.502
    careful           0.411
    calm              0.218
    blue              0.258
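
One way to relate the two outputs: under the theta parameterization the residual variances are fixed to 1, while under delta the total variance of each latent response y* is fixed to 1. Assuming the usual relation scale = 1/sqrt(Var(y*)), the numbers for gen can be checked by hand. The sketch below uses only values copied from the theta output above; the variable names are illustrative.

```r
# Values for `gen` copied from the theta output
lambda_phys <- 1.000   # Phys =~ gen (marker loading)
lambda_ment <- 0.136   # Ment =~ gen
var_phys    <- 0.984   # Var(Phys)
var_ment    <- 1.007   # Var(Ment)
cov_pm      <- 0.004   # Cov(Phys, Ment)
theta_gen   <- 1.000   # residual variance, fixed under theta
tau1_theta  <- -3.215  # gen|t1 on the theta scale

# Model-implied variance of the latent response y* for gen
var_ystar <- lambda_phys^2 * var_phys +
  lambda_ment^2 * var_ment +
  2 * lambda_phys * lambda_ment * cov_pm +
  theta_gen

# Scale factor, reported under "Scales y*" in the output
scale_gen <- 1 / sqrt(var_ystar)

# Threshold and Phys loading re-expressed on the standardized scale
tau1_std   <- tau1_theta * scale_gen
lambda_std <- lambda_phys * sqrt(var_phys) * scale_gen

print(round(c(scale = scale_gen, tau1 = tau1_std, loading = lambda_std), 3))
```

If this reading is right, the unstandardized estimates from the delta and theta fits are on different scales and are not directly comparable, whereas the Std.all columns should agree up to sampling error.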

Upvotes: 2

Views: 60

Answers (0)
