D.A. Hannan

Reputation: 31

R: Structural Equation Modeling, Item Parceling

I am attempting to construct a structural equation model in R for the relationships between the latent variables "aptitude" and "faculty/curriculum effectiveness" in a set of de-identified medical education data. To preserve as much of the data as possible, I want to include the scores of all of the exams that a medical student takes in "blocks" during the first two years of medical school (denoted MS1 and MS2). Each block covers a different category of material and has a different number of exams. Eventually, this would lead to a larger structural model assessing the relationship between the latent variables above and the USMLE STEP1 qualification exam, which assesses all of med school years 1 and 2, with the hope of identifying which blocks have a stronger relationship with the scores on this exam, mediated by the other latent variables. For ease, each exam in the data.frame all.exams is named by its block and by its order within that block:

   head(all.exams)
   BLK1.1 BLK1.2 BLK1.3 BLK1.4 BLK2.1 BLK2.2 BLK2.3 BLK3.1 BLK3.2
66   66.7   87.8   50.0   82.4   81.8  100.0   87.2   83.3   69.7
67  100.0   95.9  100.0   97.1  100.0  100.0   94.9  100.0  100.0
68  100.0   91.9   66.7   88.2  100.0  100.0   94.9   91.7   97.0
69  100.0   93.2   83.3   95.6   81.8  100.0   97.4   95.8   93.9
70  100.0   89.2   83.3   85.3  100.0  100.0   87.2   87.5  100.0
71   91.7   90.5   83.3   88.2   90.9   83.3   94.9   95.8  100.0
   BLK3.3 BLK4.1 BLK4.2 BLK5.1 BLK5.2 MS2BLK1.1 MS2BLK1.2 MS2BLK1.3
66   81.3    100   80.3   90.5  100.0        85        81        82
67   95.8    100   94.4  100.0  100.0        99        98        96
68   87.5    100   87.3   81.0   66.7        90        93        93
69   89.6    100   88.7  100.0  100.0        93        84        90
70   85.4    100   85.9   90.5  100.0        97        87        88
71   87.5    100   90.1   95.2  100.0        95        89        89
   MS2BLK1.4 MS2BLK2.1 MS2BLK2.2 MS2BLK2.3 MS2BLK3.1 MS2BLK3.2
66      90.8        82      74.3      89.3      78.4      80.0
67     100.0        95     100.0      98.7      99.2      95.2
68      95.4        94      95.7      93.3      95.2      95.2
69      95.4        91      97.1      93.3      84.8      92.0
70      93.9        94      92.9      94.7      85.6      82.4
71      95.4        94      92.9      93.3      92.0      92.0
   MS2BLK4.1 MS2BLK4.2 MS2BLK4.3 MS2BLK5.1 MS2BLK5.2 MS2BLK5.3 STEP1
66      75.6      80.3      82.3      82.4        74        93   193
67      97.5      93.8      97.5     100.0       100        99   251
68      89.9      95.1      84.8      93.6        94        93   242
69      85.7      92.6      91.1      88.0        91        95   226
70      82.4      81.5      92.4      90.4        94        93   233
71      89.9      88.9      83.5      96.0        97        90   231

This is an ideal data set to which to apply "item parceling," since we are more interested in the factor loadings between the latent variables and each "block" of exams than in the relationship between each individual exam and each latent variable.

semTools features a function parcelAllocation

https://www.rdocumentation.org/packages/semTools/versions/0.4-12/topics/parcelAllocation

which allows the user to combine manifest variables in a SEM into a specified number of parcels per latent variable and with a specified number of items within each parcel. According to the example included with the notes on semTools, the item syntax should look like:

item.syntax.full <- c(paste0("faculty =~ BLK1.", 1:4),
                      paste0("faculty =~ BLK2.", 1:3),
                      paste0("faculty =~ BLK3.", 1:3),
                      paste0("faculty =~ BLK4.", 1:2),
                      paste0("faculty =~ BLK5.", 1:2),
                      paste0("faculty =~ MS2BLK1.", 1:4),
                      paste0("faculty =~ MS2BLK2.", 1:3),
                      paste0("faculty =~ MS2BLK3.", 1:2),
                      paste0("faculty =~ MS2BLK4.", 1:3),
                      paste0("faculty =~ MS2BLK5.", 1:3),
                      paste0("aptitude =~ BLK1.", 1:4),
                      paste0("aptitude =~ BLK2.", 1:3),
                      paste0("aptitude =~ BLK3.", 1:3),
                      paste0("aptitude =~ BLK4.", 1:2),
                      paste0("aptitude =~ BLK5.", 1:2),
                      paste0("aptitude =~ MS2BLK1.", 1:4),
                      paste0("aptitude =~ MS2BLK2.", 1:3),
                      paste0("aptitude =~ MS2BLK3.", 1:2),
                      paste0("aptitude =~ MS2BLK4.", 1:3),
                      paste0("aptitude =~ MS2BLK5.", 1:3))
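To make the shape of this vector concrete, here is a small base-R sketch of what a single paste0() call expands to. The object names blk1.faculty, block.sizes, and n.strings are introduced only for this illustration; the block sizes are read off the column names shown in the head() output above.

```r
# paste0() recycles the prefix over 1:4, giving one lavaan-style string per exam
blk1.faculty <- paste0("faculty =~ BLK1.", 1:4)
print(blk1.faculty)
# [1] "faculty =~ BLK1.1" "faculty =~ BLK1.2" "faculty =~ BLK1.3" "faculty =~ BLK1.4"

# Exams per block, MS1 blocks first, then MS2 blocks (hypothetical helper vector)
block.sizes <- c(4, 3, 3, 2, 2, 4, 3, 2, 3, 3)

# Every exam is listed once per latent variable, so the full vector
# holds 2 * 29 = 58 strings
n.strings <- 2 * sum(block.sizes)
print(n.strings)
# [1] 58
```

Written this way, every exam appears as an indicator of both latent variables.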

The lavaan syntax/style model is specified by the code:

  parcel.model="
faculty=~par1+par2+par3+par4+par5+par6+par7+par8+par9+par10
aptitude=~par11+par12+par13+par14+par15+par16+par17+par18+par19+par20
"

Using the semTools parcelAllocation function, the following code should fit a lavaan-style structural equation model with two latent variables and ten parcels per latent variable, each parcel containing the number of manifest items/variables given by the nPerPar argument:

parcelAllocation(model = parcel.model,
                 dataset = all.exams[, -30],
                 nPerPar = list(c(4, 3, 3, 2, 2, 4, 3, 2, 3, 3),
                                c(4, 3, 3, 2, 2, 4, 3, 2, 3, 3)),
                 facPlc = list(apt.names, fac.names),
                 nAlloc = 20,
                 syntax = item.syntax.full)

where,

fac.names=colnames(all.exams)
fac.names=c("faculty",fac.names[-30])
apt.names=colnames(all.exams)
apt.names=c("aptitude",apt.names[-30]) 

These vectors hold the name of each latent variable followed by the names of all of the manifest variables to be parceled. "STEP1" is excluded because it is not included in the lavaan model or the item syntax.

However, when I run the above code I get the following error message:

Error in parcelAllocation(model = parcel.model, dataset = all.exams, nPerPar = list(c(4,  : 
  ** WARNING! ** Parcels incorrectly specified. Check input!

I have tried creating a simpler structural model with 3 parcels per latent variable and 3, 3, and 4 items per parcel, respectively (totaling the number of exams in the first two years of medical school (10) prior to the STEP1 examination):

 parcel.model.simp="
faculty=~par1+par2+par3
aptitude=~par4+par5+par6
"

and using the appropriately adjusted parcelAllocation code:

parcelAllocation(model=parcel.model.simp,dataset=all.exams[,-30],nPerPar = list(c(3,3,4),c(3,3,4)),facPlc =  list(apt.names,fac.names),nAlloc=20,syntax=item.syntax.full)

but it yields the same error message as above.
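As a quick sanity check, and not a confirmed diagnosis of the error, the parcel sizes can be compared with the number of exam columns. The base-R sketch below rebuilds the exam names from the block sizes visible in the head() output. The names ms1.sizes, ms2.sizes, exam.names, nPerPar.full, and nPerPar.simp are introduced only for this illustration, and whether parcelAllocation() makes this exact comparison internally is an assumption.

```r
# Exams per block, read off the column names in the head() output
ms1.sizes <- c(4, 3, 3, 2, 2)
ms2.sizes <- c(4, 3, 2, 3, 3)

# Rebuild the exam column names: BLK1.1, ..., BLK5.2, MS2BLK1.1, ..., MS2BLK5.3
exam.names <- c(
  unlist(lapply(seq_along(ms1.sizes),
                function(b) paste0("BLK", b, ".", seq_len(ms1.sizes[b])))),
  unlist(lapply(seq_along(ms2.sizes),
                function(b) paste0("MS2BLK", b, ".", seq_len(ms2.sizes[b]))))
)

# Same construction as in the question: latent name followed by the exam names
fac.names <- c("faculty", exam.names)

nPerPar.full <- c(4, 3, 3, 2, 2, 4, 3, 2, 3, 3)
nPerPar.simp <- c(3, 3, 4)

length(exam.names)   # 29 exam columns
sum(nPerPar.full)    # 29 items across the ten parcels
sum(nPerPar.simp)    # 10 items across the three parcels
length(fac.names)    # 30 entries, because the latent name is prepended
```

The ten-parcel sizes account for all 29 exams, the three-parcel sizes cover only 10 items, and each name vector passed as facPlc has 30 entries.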

How can I get this function to effectively parcel the exams into parcels corresponding to each block? What errors seem present in my code? Any suggestions or corrections to the parcelAllocation code or critical feedback on my SEM approach to this question in general would be extremely helpful. I have exhaustively searched for examples of parceling that are this complex and troubleshooting for this error message and have found neither.

Thank you,

David

Upvotes: 3

Views: 1848

Answers (1)

Terrence

Reputation: 1250

I'm not sure how helpful this will be after over a year since the OP, but I find it hard to provide the requested help because the description does not seem to match the purpose of parcelAllocation(). You state that you want to "parcel the exams into parcels corresponding to each block", which means you do not want random parcels at all (which is what this function is for). Instead, you should simply create those parcels as new variables in your data by calculating a composite score (the sum or mean across exams within a block), then use those composites as indicators.
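A minimal base-R sketch of that composite approach follows, assuming block membership can be read from the column names by dropping the trailing ".<number>". The toy data frame reuses a few columns from the first three rows of the head() output in the question. The names exam.cols, block.id, block.means, and block.data are introduced only for this illustration.

```r
# Toy subset of all.exams (rows 66-68, a few columns only)
all.exams <- data.frame(
  BLK1.1    = c(66.7, 100.0, 100.0),
  BLK1.2    = c(87.8,  95.9,  91.9),
  BLK2.1    = c(81.8, 100.0, 100.0),
  BLK2.2    = c(100.0, 100.0, 100.0),
  MS2BLK1.1 = c(85, 99, 90),
  MS2BLK1.2 = c(81, 98, 93),
  STEP1     = c(193, 251, 242)
)

# Exam columns only; STEP1 is an outcome, not a block exam
exam.cols <- setdiff(colnames(all.exams), "STEP1")

# Drop the trailing ".<number>" to get the block label, e.g. "BLK1.2" -> "BLK1"
block.id <- sub("\\.[0-9]+$", "", exam.cols)
block.id <- factor(block.id, levels = unique(block.id))  # keep original order

# One composite per block: the mean across that block's exams for each student
block.means <- sapply(split(exam.cols, block.id),
                      function(cols) rowMeans(all.exams[, cols, drop = FALSE]))

# Composites plus the outcome, ready to serve as indicators in a lavaan model
block.data <- data.frame(block.means, STEP1 = all.exams$STEP1)
print(block.data)
```

Using rowSums() in place of rowMeans() would give sum scores. Means keep blocks with different numbers of exams on the same 0 to 100 scale.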

Some other problems:

  • You seem to want all items (at both Time 1 and Time 2) to measure both factors (faculty and aptitude), and those factors are not time-specific (time-specific factors would be, e.g., faculty1, faculty2, aptitude1, aptitude2).
  • You are using syntax that seems to match an updated version of the parcelAllocation() function, which has features that were not available in the documentation page you linked to (version 0.4-12). Perhaps it will be clearer how the function works if you work through the examples in the updated documentation (version 0.5-2):

https://www.rdocumentation.org/packages/semTools/versions/0.5-2/topics/parcelAllocation

But again, it does not sound like you want to randomly allocate parcels, but rather to assign exams within a block to a parcel for that block. See this article about the advantages of "facet-representative parceling", which in your case would include all the within-block sources of variance in a single parcel, thus excluding those sources from the construct variance.

Good luck,

Terrence D. Jorgensen

Upvotes: 0
