user2388178
user2388178

Reputation: 11

Polychoric correlation (Stata) using multiple imputations and a complex sample design

I have a data base (I use Stata 13) that has multiple imputations with a complex sample design (Strate and Pweight), so I generally use the following command before my analysis : mi estimate, esampvaryok:svy:

I just want to know is there any way to use the polychoric command in Stata in that context? Or, if it's not possible, do you know other software that would allow me to do so?

Upvotes: 1

Views: 1663

Answers (1)

Steve Samuels
Steve Samuels

Reputation: 908

Apply polychoric to each imputation data set and then average the results. Although polychoric is not survey-aware, only the probability weights are needed to estimate the correlations. Here's code that computes two estimates of the correlations: 1) the average of the individual correlations from polychoric; 2) an estimate based on the average inverse-hyperbolic-tangent transform of those correlations. In the example they are similar, but I'd usually prefer the latter. See: Nick Cox, Speaking Stata: Correlation with confidence, or Fisher’s z revisited. The Stata Journal (2008) 8, Number 3, pp. 413–439. http://www.stata-journal.com/article.html?article=pr0041.

Update : Output the correlations to Stata matrices and added row & column names

/* Create MI data set */
set seed 43228226
sysuse auto, clear
recode rep78 1/2=3
replace foreign = . in 3/5
mi set flong
mi register impute rep78 foreign
mi impute chained ///
(ologit) rep78  ///
(logit) foreign =  turn weight mpg ///
[pw = turn], add(5) double

/* Set up polychoric */

/* local macro with variables */
local pvars rep78 foreign mpg weight

local nvars : word count("`pvars'")
mata: nv = strtoreal(st_local("nvars"))

qui mi des
mata: nreps = st_numscalar("r(M)")

/* loop through MI data sets to get sums*/
mata: sum_r = J(nv,nv,0)
mata: sum_atr = J(nv,nv,0)

forvalues i = 1/`=r(M)'{
qui polychoric `pvars' [pw = turn] if _mi_m==`i'
mata: r = st_matrix("r(R)")
mata: sum_r = sum_r  + r
mata: sum_atr = sum_atr +atanh(r)
}
/* Now average and get estimates */
mata:
st_matrix("rho1",sum_r/nreps)
/* For version based on atanh:
1) Get average of atanh transforms.
2) Diagonal elements are missing; substitute 0s.
3) Back transform to correlation scale.
4) Put 1s on diagonal.
5) Make the matrix symmetric. */
 st_matrix("rho2",  ///
 makesymmetric(tanh(lowertriangle(sum_atr/nreps,0))) +I(nv))
end


/* rho1 : average correlations
   rho2 : back transform of average atanh(r) */
forvalues i = 1/2 {
mat colnames rho`i' = `pvars'
mat rownames rho`i' = `pvars'
mat list rho`i'
}

Update 2: Mostly Mata

/* Set up variables & pweight  */
local pcvars rep78 foreign mpg weight
local pwtvar turn

mata:
stata("qui mi query")
M= st_numscalar("r(M)") 

vnames = tokens(st_local("pcvars"))
nv = cols(vnames)

/*Initialize sums for average numerators */
sum_r = J(nv,nv,0)
sum_atr = J(nv,nv,0)

/* Run -polychoric- on each imputed data set */
for (j = 1; j<=M; j++) {
  st_numscalar("j",j)
  stata("qui polychoric `pcvars' [pw = `pwtvar'] if _mi_m==j")
  r = st_matrix("r(R)")
  sum_r = sum_r  + r
  sum_atr = sum_atr + atanh(r)
}
/* Create Stata correlation matrices from average over imputations*/
st_matrix("rho1",sum_r/M)
st_matrix("rho2",  ///
  makesymmetric(tanh(lowertriangle(sum_atr/M,0))) +I(nv))

/* Label rows & columns */
c =  (J(nv,1,""),vnames')
st_matrixrowstripe("rho1",c)
st_matrixcolstripe("rho1",c)

st_matrixrowstripe("rho2",c)
st_matrixcolstripe("rho2",c)
end

mat list rho1
mat list rho2

Upvotes: 1

Related Questions