Reputation: 49
I have a problem with the Stata command
svy: tab x
when a category of x has no observations: Stata simply drops the corresponding row.
My task is to run several tabulations and to save and export the key results to a CSV file. Sometimes the stored vectors have n elements and sometimes, because of the empty categories, they have only n-1 elements, so I don't know how to combine them into a larger matrix (or at least export them to a file with regular spacing between rows and a value of 0 where there are no observations). I have also tried
estpost svy, subpop(x0): tab x, count se format(%10.4g)
but I still have the same problem.
Reputation: 908
Update 3
This solution is based on estpost svy: tab because that command returns more usable result vectors than svy: tab itself. Like the previous version, this solution puts all those results into a Stata data set. It adds a check for whether the data contain missing categories before resorting to the loop, and it tightens the loop limits slightly. Following Nick's suggestion, missing values are substituted for all standard-error-related statistics. Note that estpost svy: tab rep78 by default puts the estimated cell proportions into e(b) and their standard errors into e(se), while estpost svy: tab rep78, count puts the estimated counts and their SEs into those matrices. However, the other summary is still available, in either e(cell) or e(count).
sysuse auto, clear
drop if rep78==2 |rep78==5
svyset _n [pw = turn]
estpost svy: tab rep78, se
/* Number categories from 1 to max */
local maxcat = 5
mata:
/* count rows, add one for totals row;
   assign the category for that row as .a */
r = (st_matrix("e(Row)"), .a)'
b = st_matrix("e(b)")'
serr = st_matrix("e(se)")'
lb = st_matrix("e(lb)")'
ub = st_matrix("e(ub)")'
def = st_matrix("e(deff)")'
dft = st_matrix("e(deft)")'
ct = st_matrix("e(count)")'
pr = st_matrix("e(cell)")'
obs = st_matrix("e(obs)")'
d1 = (r, b, serr, lb, ub, def, dft, obs, pr, ct)
/* In the totals row, set the SE-related statistics
   (columns 3 to 7) to a standard missing value */
d1[rows(d1), 3..7] = J(1, 5, .)
/* Check if there are no missing rows.
   If so, output the original returned matrices */
if (`e(r)' == `maxcat') {
    d = d1
} else {
    /* Else create a zero matrix and populate it
       with statistics for the non-missing categories */
    d2 = J(`maxcat', 10, 0)
    d2[., 1] = (1::`maxcat')
    for (j = 1; j <= `e(r)'; j++) {
        for (k = 1; k <= r[j,1]; k++) {
            if (r[j,1] == k) {
                d2[k,2] = b[j,1]
                d2[k,3] = serr[j,1]
                d2[k,4] = lb[j,1]
                d2[k,5] = ub[j,1]
                d2[k,6] = def[j,1]
                d2[k,7] = dft[j,1]
                d2[k,8] = obs[j,1]
                d2[k,9] = pr[j,1]
                d2[k,10] = ct[j,1]
            }
        }
    }
    /* If rows are missing, set SE-related stats to missing */
    for (k = 1; k <= `maxcat'; k++) {
        if (d2[k,2] == 0) d2[k,3..7] = J(1, 5, .)
    }
    /* Now add the totals row */
    d = d2 \ d1[rows(d1), .]
}
end
/* Clear the data in memory so that getmata can build a new data set from d */
clear
getmata (rep78 b se lb ub deff deft nobs prop count ) = d
format b se lb ub deff deft prop %5.2f
format nobs count %10.0gc
label define rtot .a "Totals"
label values rep78 rtot
save results, replace
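Since the goal in the question is a CSV file, the saved data set can be written out directly. This is a minimal sketch: the file name results.csv is only a placeholder, and export delimited assumes Stata 13 or later (outsheet with the comma option is the older equivalent).

```stata
* Sketch: export the padded results (one row per category plus totals) to CSV
* results.csv is a hypothetical file name
use results, clear
export delimited using "results.csv", replace
```

Because every tabulation now yields the same number of rows, files produced this way for different variables line up row by row.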
Original Answer
Here's a way to create a matrix new that will include the zero categories. The logic: set up a matrix of zeros to hold results for all the categories; then replace the zeros with values from the non-missing categories. The macro maxcat contains the maximum number of categories for the tabulated variable. The code assumes that the categories in the tabulated variable are integers from 1 to maxcat. The mata call extracts the vector of standard errors, and the scalar e(r) holds the number of rows in the actual table.
sysuse auto, clear
svyset _n
drop if rep78== 2 | rep78==5
svy: tab rep78, count se
local maxcat = 5 //max no. of categories
matrix oldr = e(Row)' // category values
matrix ct = e(Obs) // table counts
// serr is a vector of std. errors
mata: st_matrix("serr", sqrt(diagonal(st_matrix("e(V)"))))
// matrix new will hold the expanded results
matrix new = J(`maxcat', 3, 0)
forvalues j = 1/`=e(r)' {
    forvalues k = 1/`maxcat' {
        matrix new[`k',1] = `k'
        if oldr[`j',1] == `k' {
            matrix new[`k',2] = ct[`j',1]
            matrix new[`k',3] = serr[`j',1]
        }
    }
}
matrix list new
Update 2: Here's a version that does most of the work in Mata, then saves the estimates into a Stata data set. I've changed the names of the matrices slightly.
sysuse auto, clear
svyset _n
drop if rep78== 2 | rep78==5
svy: tab rep78, count se
local maxcat = 5
mata:
r = st_matrix("e(Row)")'
ct = st_matrix("e(Obs)")
serr = sqrt(diagonal(st_matrix("e(V)")))
d = J(`maxcat', 3, 0)
for (j = 1; j <= `e(r)'; j++) {
    for (k = 1; k <= `maxcat'; k++) {
        d[k,1] = k
        if (r[j,1] == k) {
            d[k,2] = ct[j,1]
            d[k,3] = serr[j,1]
        }
    }
}
end
// clear the data in memory so that getmata can build a new data set from d
clear
getmata (rep78 count se) = d
replace se = . if count==0
format se %8.2f
save results, replace
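The double loop in the Update 2 code can probably be avoided by using the observed category values as row subscripts. The following is a minimal sketch rather than a tested replacement. It assumes, as above, that the categories are integers from 1 to maxcat and that svy: tab rep78, count se is the most recent estimation command, so that its e() results are still available.

```stata
local maxcat = 5
mata:
r    = st_matrix("e(Row)")'                  // observed category values
ct   = st_matrix("e(Obs)")                   // counts for the observed categories
serr = sqrt(diagonal(st_matrix("e(V)")))     // their standard errors
d = J(`maxcat', 3, 0)                        // zero-filled results matrix
d[., 1] = (1::`maxcat')                      // column 1: every category value
d[r, 2] = ct                                 // fill only the rows that were observed
d[r, 3] = serr
end
```

Rows for categories with no observations keep their zeros, so the later replace se = . if count==0 step applies unchanged.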