I am using secondary data to build a measurement model in R and, for proof of concept, I am trying to simulate data from this measurement model and fit it to the original model to show that it will produce similar parameter estimates.
The secondary data I am using to build the measurement model is provided in Table 3 of the following article: Kwon, J. Y., & Sawatzky, R. (2017). Examining gender-related differential item functioning of the Veterans Rand 12-item Health Survey. Quality of Life Research, 26, 2877-2883. https://doi.org/10.1007/s11136-017-1638-x
When I simulate data from the Kwon and Sawatzky (2017) model and then fit the original measurement model to the simulated data, the thresholds are recovered almost perfectly; however, the factor loadings are somewhat off and the covariances are far off.
I've been proceeding by trial and error, and these are the best parameter estimates I can obtain so far. I carried out the same process without problems using secondary data from another measure (the WHOQOL-BREF), where the loadings, intercepts, and covariances were recovered closely. So I must be doing something wrong here, but I'm unsure what exactly. Below are the script I used and the output I get, respectively.
Script
library(lavaan)
#populated with values from Kwon & Sawatzky (2017)
sf12_script<- '
Phys =~ 1*gen + 1.76*mod_active + 1.46*stairs + 1.94*phys_accomp + 2.24*work +
0*emo_accomp + 0*careful + 1.66*pain + .69*social + 0*calm +
.93*energy + 0*blue
Ment =~ .14*gen + 0*mod_active + 0*stairs + 0*phys_accomp + 0*work +
1*emo_accomp + .82*careful + 0*pain + .46*social + .53*calm +
.24*energy + .59*blue
#item thresholds
gen | -2.27*t1 + -0.77*t2 + 0.67*t3 + 2.21*t4
mod_active | -1.42*t1 + 0.33*t2
stairs | -0.91*t1 + 0.67*t2
phys_accomp | -3.04*t1 + -1.62*t2 + -0.41*t3 + 0.60*t4
work | -3.26*t1 + -1.74*t2 + -0.47*t3 + 0.66*t4
emo_accomp | -4.58*t1 + -3.18*t2 + -1.92*t3 + -0.98*t4
careful | -4.14*t1 + -3.00*t2 + -2.00*t3 + -1.23*t4
pain | -2.74*t1 + -1.34*t2 + -0.48*t3 + 0.83*t4
social | -3.05*t1 + -1.98*t2 + -0.87*t3 + -0.109*t4
calm | -2.99*t1 + -1.91*t2 + -0.91*t3 + -0.35*t4 + 1.53*t5
energy | -2.47*t1 + -1.29*t2 + -0.20*t3 + 0.43*t4 + 2.37*t5
blue | -3.33*t1 + -2.42*t2 + -1.90*t3 + -0.78*t4 + 0.35*t5
#between item correlations
mod_active ~~ 0.46*stairs
phys_accomp ~~ 0.41*work
emo_accomp ~~ 0.41*careful
calm ~~ 0.09*blue
'
#simulate data from the parameter estimates in Kwon & Sawatzky (2017)
sim_samp_sf12 <- simulateData(sf12_script, sample.nobs = 277518, parameterization = 'theta')
#the script for the two-factor model
sf12_empty_script <- '
Phys =~ 1*gen + mod_active + stairs + phys_accomp + work + pain + social + energy
Ment =~ 1*emo_accomp + gen + careful + social + calm + energy + blue
#populating item thresholds
gen | t1 + t2 + t3 + t4
mod_active | t1 + t2
stairs | t1 + t2
phys_accomp | t1 + t2 + t3 + t4
work | t1 + t2 + t3 + t4
emo_accomp | t1 + t2 + t3 + t4
careful | t1 + t2 + t3 + t4
pain | t1 + t2 + t3 + t4
social | t1 + t2 + t3 + t4
calm | t1 + t2 + t3 + t4 + t5
energy | t1 + t2 + t3 + t4 + t5
blue | t1 + t2 + t3 + t4 + t5
#between item correlations
mod_active ~~ stairs
phys_accomp ~~ work
emo_accomp ~~ careful
calm ~~ blue
'
#fit the model to the data generated from their sample values
sf12_fit_simsamp <- cfa(sf12_empty_script, data = sim_samp_sf12, std.lv = FALSE)
#check fit
summary(sf12_fit_simsamp, standardized = TRUE, fit.measures = TRUE, rsquare = TRUE)
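One way to make the comparison between population and recovered values systematic is to tabulate them side by side. The sketch below is a minimal illustration on a hypothetical one-factor model with continuous indicators; `compare_recovery()`, the toy model, and all numeric values are introduced here for illustration and are not part of the script above. It assumes the population syntax fixes each parameter with a numeric value, so that `lavaanify()` stores it in the `ustart` column.

```r
library(lavaan)

# Hypothetical helper: pair each fixed population value with its estimate.
compare_recovery <- function(pop_model, fit) {
  pop <- lavaanify(pop_model)  # numeric modifiers end up in `ustart`
  pop <- pop[!is.na(pop$ustart), c("lhs", "op", "rhs", "ustart")]
  est <- as.data.frame(parameterEstimates(fit))[, c("lhs", "op", "rhs", "est")]
  out <- merge(pop, est, by = c("lhs", "op", "rhs"))
  out$diff <- out$est - out$ustart  # recovery error per parameter
  out
}

# Toy population model (illustrative values only)
toy_pop <- 'f =~ 1*y1 + 0.8*y2 + 1.2*y3 + 0.6*y4'
toy_dat <- simulateData(toy_pop, sample.nobs = 2000, seed = 42)
toy_fit <- cfa('f =~ y1 + y2 + y3 + y4', data = toy_dat)

recovery <- compare_recovery(toy_pop, toy_fit)
print(recovery)
```

With the objects defined in the script above, the analogous call would presumably be `compare_recovery(sf12_script, sf12_fit_simsamp)`.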
Output
It may take a couple tries to get the model to converge, but when it does, this is the output:
lavaan 0.6-19 ended normally after 42 iterations
Estimator DWLS
Optimization method NLMINB
Number of model parameters 67
Number of observations 277518
Model Test User Model:
Standard Scaled
Test Statistic 35.633 55.199
Degrees of freedom 46 46
P-value (Chi-square) 0.865 0.166
Scaling correction factor 0.919
Shift parameter 16.428
simple second-order correction
Model Test Baseline Model:
Test statistic 6784656.349 3868007.538
Degrees of freedom 66 66
P-value 0.000 0.000
Scaling correction factor 1.754
User Model versus Baseline Model:
Comparative Fit Index (CFI) 1.000 1.000
Tucker-Lewis Index (TLI) 1.000 1.000
Robust Comparative Fit Index (CFI) 1.000
Robust Tucker-Lewis Index (TLI) 1.000
Root Mean Square Error of Approximation:
RMSEA 0.000 0.001
90 Percent confidence interval - lower 0.000 0.000
90 Percent confidence interval - upper 0.001 0.002
P-value H_0: RMSEA <= 0.050 1.000 1.000
P-value H_0: RMSEA >= 0.080 0.000 0.000
Robust RMSEA 0.001
90 Percent confidence interval - lower 0.000
90 Percent confidence interval - upper 0.002
P-value H_0: Robust RMSEA <= 0.050 1.000
P-value H_0: Robust RMSEA >= 0.080 0.000
Standardized Root Mean Square Residual:
SRMR 0.002 0.002
Parameter Estimates:
Parameterization Delta
Standard errors Robust.sem
Information Expected
Information saturated (h1) model Unstructured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Phys =~
gen 1.000 0.704 0.704
mod_active 1.005 0.003 398.066 0.000 0.708 0.708
stairs 1.172 0.002 484.566 0.000 0.825 0.825
phys_accomp 1.264 0.002 506.441 0.000 0.890 0.890
work 1.298 0.003 512.201 0.000 0.914 0.914
pain 1.217 0.002 514.998 0.000 0.856 0.856
social 0.758 0.003 274.971 0.000 0.533 0.533
energy 0.951 0.002 407.182 0.000 0.670 0.670
Ment =~
emo_accomp 1.000 0.703 0.703
gen 0.138 0.004 38.977 0.000 0.097 0.097
careful 0.882 0.007 135.161 0.000 0.620 0.620
social 0.503 0.006 91.128 0.000 0.353 0.353
calm 0.664 0.010 67.562 0.000 0.467 0.467
energy 0.246 0.004 64.542 0.000 0.173 0.173
blue 0.722 0.010 69.175 0.000 0.508 0.508
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.mod_active ~~
.stairs 0.185 0.002 122.614 0.000 0.185 0.463
.phys_accomp ~~
.work 0.076 0.001 69.446 0.000 0.076 0.409
.emo_accomp ~~
.careful 0.235 0.007 35.020 0.000 0.235 0.422
.calm ~~
.blue 0.068 0.004 17.882 0.000 0.068 0.090
Phys ~~
Ment 0.001 0.002 0.385 0.700 0.001 0.001
Thresholds:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
gen|t1 -2.273 0.007 -338.199 0.000 -2.273 -2.273
gen|t2 -0.769 0.003 -289.931 0.000 -0.769 -0.769
gen|t3 0.671 0.003 259.767 0.000 0.671 0.671
gen|t4 2.213 0.006 348.863 0.000 2.213 2.213
mod_active|t1 -1.418 0.003 -406.378 0.000 -1.418 -1.418
mod_active|t2 0.328 0.002 135.367 0.000 0.328 0.328
stairs|t1 -0.911 0.003 -328.370 0.000 -0.911 -0.911
stairs|t2 0.668 0.003 258.736 0.000 0.668 0.668
phys_accomp|t1 -3.021 0.016 -186.734 0.000 -3.021 -3.021
phys_accomp|t2 -1.623 0.004 -410.502 0.000 -1.623 -1.623
phys_accomp|t3 -0.413 0.002 -168.227 0.000 -0.413 -0.413
phys_accomp|t4 0.598 0.003 235.254 0.000 0.598 0.598
work|t1 -3.254 0.023 -144.007 0.000 -3.254 -3.254
work|t2 -1.738 0.004 -406.277 0.000 -1.738 -1.738
work|t3 -0.471 0.002 -190.042 0.000 -0.471 -0.471
work|t4 0.659 0.003 255.619 0.000 0.659 0.659
emo_accomp|t1 -4.337 0.155 -27.900 0.000 -4.337 -4.337
emo_accomp|t2 -3.213 0.021 -151.006 0.000 -3.213 -3.213
emo_accomp|t3 -1.918 0.005 -391.297 0.000 -1.918 -1.918
emo_accomp|t4 -0.978 0.003 -344.042 0.000 -0.978 -0.978
careful|t1 -4.183 0.114 -36.806 0.000 -4.183 -4.183
careful|t2 -3.011 0.016 -188.561 0.000 -3.011 -3.011
careful|t3 -2.007 0.005 -380.594 0.000 -2.007 -2.007
careful|t4 -1.236 0.003 -389.542 0.000 -1.236 -1.236
pain|t1 -2.752 0.011 -241.369 0.000 -2.752 -2.752
pain|t2 -1.344 0.003 -401.043 0.000 -1.344 -1.344
pain|t3 -0.481 0.002 -193.789 0.000 -0.481 -0.481
pain|t4 0.826 0.003 306.116 0.000 0.826 0.826
social|t1 -3.031 0.016 -184.689 0.000 -3.031 -3.031
social|t2 -1.974 0.005 -384.755 0.000 -1.974 -1.974
social|t3 -0.869 0.003 -317.556 0.000 -0.869 -0.869
social|t4 -0.110 0.002 -46.123 0.000 -0.110 -0.110
calm|t1 -2.978 0.015 -195.155 0.000 -2.978 -2.978
calm|t2 -1.906 0.005 -392.568 0.000 -1.906 -1.906
calm|t3 -0.906 0.003 -326.989 0.000 -0.906 -0.906
calm|t4 -0.346 0.002 -142.478 0.000 -0.346 -0.346
calm|t5 1.536 0.004 410.634 0.000 1.536 1.536
energy|t1 -2.467 0.008 -300.607 0.000 -2.467 -2.467
energy|t2 -1.289 0.003 -395.730 0.000 -1.289 -1.289
energy|t3 -0.202 0.002 -84.319 0.000 -0.202 -0.202
energy|t4 0.427 0.002 173.582 0.000 0.427 0.427
energy|t5 2.371 0.007 319.706 0.000 2.371 2.371
blue|t1 -3.358 0.026 -126.846 0.000 -3.358 -3.358
blue|t2 -2.413 0.008 -311.422 0.000 -2.413 -2.413
blue|t3 -1.900 0.005 -393.287 0.000 -1.900 -1.900
blue|t4 -0.779 0.003 -292.660 0.000 -0.779 -0.779
blue|t5 0.351 0.002 144.389 0.000 0.351 0.351
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.gen 0.495 0.495 0.495
.mod_active 0.499 0.499 0.499
.stairs 0.319 0.319 0.319
.phys_accomp 0.209 0.209 0.209
.work 0.165 0.165 0.165
.pain 0.267 0.267 0.267
.social 0.590 0.590 0.590
.energy 0.521 0.521 0.521
.emo_accomp 0.506 0.506 0.506
.careful 0.616 0.616 0.616
.calm 0.782 0.782 0.782
.blue 0.742 0.742 0.742
Phys 0.496 0.002 274.004 0.000 1.000 1.000
Ment 0.494 0.008 64.706 0.000 1.000 1.000
R-Square:
Estimate
gen 0.505
mod_active 0.501
stairs 0.681
phys_accomp 0.791
work 0.835
pain 0.733
social 0.410
energy 0.479
emo_accomp 0.494
careful 0.384
calm 0.218
blue 0.258
My best guesses:
Something is going wrong because I'm using parameter estimates from a strict invariance model across sex, and I need additional lavaan syntax for the model to be correctly specified. Which syntax, though, I'm unsure.
I'm wondering whether including the factor means and covariance parameters would improve recovery. I wasn't sure which values to use, though, since the article provides two (one for males, one for females). Is this a scenario where I can choose one group and, as long as I'm consistent, it should be fine? I've attempted that (using the values for females only and for males only), and it didn't seem to improve recovery either.
Something in the Methods/Data Analysis/Results section of the Kwon & Sawatzky (2017) article has gone completely over my head and I've specified the model wrong.
The issues with model convergence are not due to bad draws in random sampling and instead reflect a deeper issue (unknown to me).
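On the second and fourth guesses: the output estimates the Phys ~~ Ment covariance at essentially zero, and the population syntax in `sf12_script` contains no `Phys ~~ Ment` line. The sketch below shows, on a hypothetical two-factor model with continuous indicators, how a latent covariance can be written into population syntax and how the `seed` argument of `simulateData()` makes a draw reproducible, which helps separate bad draws from specification problems. All model names and numeric values are assumptions for illustration; none come from Kwon and Sawatzky (2017).

```r
library(lavaan)

# Toy population model (hypothetical values)
toy_pop <- '
  f1 =~ 1*x1 + 0.8*x2 + 1.2*x3
  f2 =~ 1*x4 + 0.9*x5 + 0.7*x6
  f1 ~~ 1*f1      # latent variances
  f2 ~~ 1*f2
  f1 ~~ 0.5*f2    # latent covariance
'

# The same seed should give the same data on every run
d1 <- simulateData(toy_pop, sample.nobs = 5000, seed = 2024)
d2 <- simulateData(toy_pop, sample.nobs = 5000, seed = 2024)
same_draw <- isTRUE(all.equal(d1, d2))

toy_fit <- cfa('
  f1 =~ x1 + x2 + x3
  f2 =~ x4 + x5 + x6
', data = d1)

# Extract the estimated latent covariance
pe <- parameterEstimates(toy_fit)
lv_cov <- pe$est[pe$lhs == "f1" & pe$op == "~~" & pe$rhs == "f2"]
print(same_draw)
print(lv_cov)
```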
UPDATED OUTPUT WITH parameterization = 'theta' in cfa()
lavaan 0.6-19 ended normally after 158 iterations
Estimator DWLS
Optimization method NLMINB
Number of model parameters 67
Number of observations 277518
Model Test User Model:
Standard Scaled
Test Statistic 21.883 40.243
Degrees of freedom 46 46
P-value (Chi-square) 0.999 0.711
Scaling correction factor 0.909
Shift parameter 16.164
simple second-order correction
Model Test Baseline Model:
Test statistic 6652355.468 3798046.688
Degrees of freedom 66 66
P-value 0.000 0.000
Scaling correction factor 1.752
User Model versus Baseline Model:
Comparative Fit Index (CFI) 1.000 1.000
Tucker-Lewis Index (TLI) 1.000 1.000
Robust Comparative Fit Index (CFI) 1.000
Robust Tucker-Lewis Index (TLI) 1.000
Root Mean Square Error of Approximation:
RMSEA 0.000 0.000
90 Percent confidence interval - lower 0.000 0.000
90 Percent confidence interval - upper 0.000 0.001
P-value H_0: RMSEA <= 0.050 1.000 1.000
P-value H_0: RMSEA >= 0.080 0.000 0.000
Robust RMSEA 0.001
90 Percent confidence interval - lower 0.000
90 Percent confidence interval - upper 0.003
P-value H_0: Robust RMSEA <= 0.050 1.000
P-value H_0: Robust RMSEA >= 0.080 0.000
Standardized Root Mean Square Residual:
SRMR 0.001 0.001
Parameter Estimates:
Parameterization Theta
Standard errors Robust.sem
Information Expected
Information saturated (h1) model Unstructured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Phys =~
gen 1.000 0.992 0.701
mod_active 1.001 0.005 199.947 0.000 0.993 0.705
stairs 1.463 0.007 204.028 0.000 1.452 0.824
phys_accomp 1.970 0.011 184.910 0.000 1.955 0.890
work 2.263 0.013 170.206 0.000 2.245 0.914
pain 1.662 0.008 211.687 0.000 1.649 0.855
social 0.688 0.004 176.206 0.000 0.683 0.527
energy 0.928 0.004 213.266 0.000 0.921 0.667
Ment =~
emo_accomp 1.000 1.003 0.708
gen 0.136 0.004 34.241 0.000 0.136 0.096
careful 0.832 0.011 77.908 0.000 0.835 0.641
social 0.456 0.008 55.626 0.000 0.458 0.354
calm 0.526 0.012 42.346 0.000 0.528 0.467
energy 0.239 0.005 48.297 0.000 0.240 0.174
blue 0.588 0.014 42.056 0.000 0.590 0.508
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.mod_active ~~
.stairs 0.458 0.003 154.632 0.000 0.458 0.458
.phys_accomp ~~
.work 0.407 0.004 99.675 0.000 0.407 0.407
.emo_accomp ~~
.careful 0.406 0.008 50.231 0.000 0.406 0.406
.calm ~~
.blue 0.087 0.005 18.869 0.000 0.087 0.087
Phys ~~
Ment 0.004 0.003 1.177 0.239 0.004 0.004
Thresholds:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
gen|t1 -3.215 0.010 -314.839 0.000 -3.215 -2.271
gen|t2 -1.096 0.004 -275.397 0.000 -1.096 -0.774
gen|t3 0.942 0.004 247.455 0.000 0.942 0.666
gen|t4 3.135 0.010 317.890 0.000 3.135 2.215
mod_active|t1 -2.006 0.006 -352.683 0.000 -2.006 -1.424
mod_active|t2 0.469 0.003 134.990 0.000 0.469 0.332
stairs|t1 -1.606 0.006 -273.788 0.000 -1.606 -0.911
stairs|t2 1.189 0.005 231.426 0.000 1.189 0.674
phys_accomp|t1 -6.745 0.043 -156.841 0.000 -6.745 -3.072
phys_accomp|t2 -3.560 0.014 -263.441 0.000 -3.560 -1.622
phys_accomp|t3 -0.892 0.006 -149.409 0.000 -0.892 -0.406
phys_accomp|t4 1.332 0.007 199.819 0.000 1.332 0.607
work|t1 -8.077 0.065 -123.651 0.000 -8.077 -3.286
work|t2 -4.292 0.019 -228.893 0.000 -4.292 -1.746
work|t3 -1.150 0.007 -156.365 0.000 -1.150 -0.468
work|t4 1.633 0.009 191.342 0.000 1.633 0.664
emo_accomp|t1 -6.357 0.305 -20.810 0.000 -6.357 -4.487
emo_accomp|t2 -4.520 0.044 -102.336 0.000 -4.520 -3.191
emo_accomp|t3 -2.727 0.021 -127.643 0.000 -2.727 -1.925
emo_accomp|t4 -1.394 0.011 -123.288 0.000 -1.394 -0.984
careful|t1 -5.240 0.113 -46.490 0.000 -5.240 -4.022
careful|t2 -3.941 0.031 -126.126 0.000 -3.941 -3.025
careful|t3 -2.619 0.017 -154.958 0.000 -2.619 -2.011
careful|t4 -1.611 0.011 -151.843 0.000 -1.611 -1.236
pain|t1 -5.320 0.025 -214.707 0.000 -5.320 -2.759
pain|t2 -2.591 0.008 -316.067 0.000 -2.591 -1.343
pain|t3 -0.925 0.005 -177.655 0.000 -0.925 -0.480
pain|t4 1.613 0.006 262.052 0.000 1.613 0.837
social|t1 -3.972 0.023 -172.736 0.000 -3.972 -3.066
social|t2 -2.570 0.008 -337.714 0.000 -2.570 -1.984
social|t3 -1.128 0.004 -288.449 0.000 -1.128 -0.871
social|t4 -0.136 0.003 -43.784 0.000 -0.136 -0.105
calm|t1 -3.384 0.019 -176.651 0.000 -3.384 -2.992
calm|t2 -2.164 0.007 -292.887 0.000 -2.164 -1.913
calm|t3 -1.033 0.004 -261.680 0.000 -1.033 -0.913
calm|t4 -0.398 0.003 -136.373 0.000 -0.398 -0.352
calm|t5 1.724 0.006 295.304 0.000 1.724 1.524
energy|t1 -3.421 0.012 -284.909 0.000 -3.421 -2.477
energy|t2 -1.790 0.005 -366.769 0.000 -1.790 -1.296
energy|t3 -0.279 0.003 -83.858 0.000 -0.279 -0.202
energy|t4 0.593 0.003 173.447 0.000 0.593 0.430
energy|t5 3.275 0.011 295.007 0.000 3.275 2.372
blue|t1 -3.858 0.031 -125.147 0.000 -3.858 -3.322
blue|t2 -2.811 0.012 -239.198 0.000 -2.811 -2.421
blue|t3 -2.205 0.008 -270.873 0.000 -2.205 -1.899
blue|t4 -0.911 0.004 -229.499 0.000 -0.911 -0.784
blue|t5 0.411 0.003 136.073 0.000 0.411 0.354
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.gen 1.000 1.000 0.499
.mod_active 1.000 1.000 0.503
.stairs 1.000 1.000 0.322
.phys_accomp 1.000 1.000 0.207
.work 1.000 1.000 0.166
.pain 1.000 1.000 0.269
.social 1.000 1.000 0.596
.energy 1.000 1.000 0.524
.emo_accomp 1.000 1.000 0.498
.careful 1.000 1.000 0.589
.calm 1.000 1.000 0.782
.blue 1.000 1.000 0.742
Phys 0.984 0.007 139.084 0.000 1.000 1.000
Ment 1.007 0.031 32.660 0.000 1.000 1.000
Scales y*:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
gen 0.706 0.706 1.000
mod_active 0.710 0.710 1.000
stairs 0.567 0.567 1.000
phys_accomp 0.455 0.455 1.000
work 0.407 0.407 1.000
pain 0.519 0.519 1.000
social 0.772 0.772 1.000
energy 0.724 0.724 1.000
emo_accomp 0.706 0.706 1.000
careful 0.768 0.768 1.000
calm 0.884 0.884 1.000
blue 0.861 0.861 1.000
R-Square:
Estimate
gen 0.501
mod_active 0.497
stairs 0.678
phys_accomp 0.793
work 0.834
pain 0.731
social 0.404
energy 0.476
emo_accomp 0.502
careful 0.411
calm 0.218
blue 0.258
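The updated run differs from the first in requesting the theta parameterization in `cfa()`; lavaan's default for ordered indicators is delta, as the first output reports. A minimal sketch of that comparison follows, using a hypothetical one-factor model with three-category items. The model, item names, and numeric values are assumptions for illustration only.

```r
library(lavaan)

# Toy population model with ordinal items (hypothetical values)
toy_pop <- '
  f =~ 1*y1 + 1.5*y2 + 0.8*y3 + 1.2*y4 + 0.9*y5
  y1 | -0.5*t1 + 0.5*t2
  y2 | -0.8*t1 + 0.3*t2
  y3 | -0.2*t1 + 0.9*t2
  y4 | -0.6*t1 + 0.6*t2
  y5 | -0.4*t1 + 0.7*t2
'
toy_dat <- simulateData(toy_pop, sample.nobs = 3000, seed = 1)

toy_model <- 'f =~ y1 + y2 + y3 + y4 + y5'
items <- paste0("y", 1:5)

# Same model and data, two parameterizations
fit_delta <- cfa(toy_model, data = toy_dat, ordered = items)
fit_theta <- cfa(toy_model, data = toy_dat, ordered = items,
                 parameterization = "theta")

# Unstandardized loadings depend on the parameterization;
# the standardized solution should agree closely between the two fits.
load_delta <- subset(parameterEstimates(fit_delta, standardized = TRUE), op == "=~")
load_theta <- subset(parameterEstimates(fit_theta, standardized = TRUE), op == "=~")
print(cbind(load_delta[, c("rhs", "est", "std.all")],
            est_theta = load_theta$est,
            std_all_theta = load_theta$std.all))
```

Comparing the `est` columns against the population loadings shows which scale the simulated values are recovered on.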