Reputation: 49
I have a problem with Stata and the command:
svy: tab x
when there are no observations for a given x
.
My problem is that when there are no observations for a category Stata simply drops the corresponding row.
My task is to run several tabulations and to save and export the key results to a csv file. Sometimes the stored vectors have n elements while sometimes because of the zeros they have only n-1 elements so I don't know how to combine them in a larger matrix (or at least export it in a file with a regular spacing between rows and with a value of 0 if there are not observations). I have also tried the
estpost svy, subpop(x0): tab x, count se format(%10.4g)
But I still have the same problem.
Upvotes: 1
Views: 1739
Reputation: 908
Update 3
This solution is based on estpost svy: tab
because that command returns more usable result vectors than does svy: tab
itself. Like the previous version, this solution puts all those results into a Stata data set. It adds a check for whether the data contain missing categories before resorting to the loop and tightens the loop limits slightly. Following Nick's suggestion, missing values are substituted for all standard-error related statistics. Note that
estpost svy: tab rep78
by default puts the estimated cell proportions into e(b)
and their standard errors into e(se)
, while
estpost svy: tab rep78, count
puts the estimated counts and their SEs into those matrices. However the other summary is still available, either in e(cell)
or e(count)
.
sysuse auto, clear
drop if rep78==2 |rep78==5
svyset _n [pw = turn]
estpost svy: tab rep78, se
/* Number categories from 1 to max */
local maxcat = 5
mata:
/* count rows, add one for totals row
assign the category for that row as .a */
r = (st_matrix("e(Row)"), .a)'
b = st_matrix("e(b)")'
serr = st_matrix("e(se)")'
lb = st_matrix("e(lb)")'
ub = st_matrix("e(ub)")'
def = st_matrix("e(deff)")'
dft = st_matrix("e(deft)")'
ct = st_matrix("e(count)")'
pr = st_matrix("e(cell)")'
obs = st_matrix("e(obs)")'
d1 =(r , b, serr, lb, ub, def, dft, obs, pr, ct)
/* Where there are no totals, use a standard missing value */
d1[rows(d1),3::7] = J(1,5, .)
/* Check if there are no missing rows.
If so, output the original returned matrices */
if (`e(r)' ==`maxcat') d = d1
/* Else create a zero matrix and populate it
with statistics for the non-missing categories*/
else {
d2= J(`maxcat',10,0)
d2[.,1] =(1::`maxcat')
for (j = 1; j<=`e(r)'; j++) {
for (k = 1; k<=r[j,1]; k++) {
if (r[j,1]== k) {
d2[k,2] = b[j,1]
d2[k,3] = serr[j,1]
d2[k,4] = lb[j,1]
d2[k,5] = ub[j,1]
d2[k,6] = def[j,1]
d2[k,7] = dft[j,1]
d2[k,8] = obs[j,1]
d2[k,9] = pr[j,1]
d2[k,10] = ct[j,1]
}
}
}
/* If rows are missing set SE-realated stats to missing*/
for (k = 1; k<=`maxcat'; k++) {
if (d2[k,2] == 0) d2[k,3..7] =J(1,5,.)
}
/* Now add the totals row */
d = d2 \ d1[rows(d1),.]
}
end
clear
getmata (rep78 b se lb ub deff deft nobs prop count ) = d
format b se lb ub deff deft prop %5.2f
format nobs count %10.0gc
label define rtot .a "Totals"
label values rep78 rtot
list
save results, replace
Original Answer Here's a way to create a matrix new
that will include the zero categories. The logic: set up a matrix of zeros to hold results for all the categories; then replace the zeros with values from the non-missing categories.The macro maxcat
contains the maximum number of categories for the tabulated variable. The code assumes that the categories in the tabled variables are integers from 1 to maxcat
. The mata
block extracts the vector of standard errors, and the scalar e(r)
holds the number of rows in the actual table.
sysuse auto, clear
svyset _n
drop if rep78== 2 | rep78==5
svy: tab rep78, count se
local maxcat = 5 //max no. of categories
matrix oldr = e(Row)' // category values
matrix ct = e(Obs) // table counts
// serr is a vector of std. errors
mata: st_matrix("serr", sqrt(diagonal(st_matrix("e(V)"))))
// matrix new will hold the expanded results
matrix new = J(`maxcat', 3, 0)
forvalues j = 1/`=e(r)' {
forvalues k = 1/`maxcat'{
matrix new[`k',1] = `k'
if oldr[`j',1]== `k' {
matrix new[`k',2] = ct[`j',1]
matrix new[`k',3] = serr[`j',1]
}
}
}
matrix list new
Update 2: Here's a version that does most of the work in Mata, then saves the estimates into a Stata data set. I've changed the names of the matrices slightly.
sysuse auto, clear
svyset _n
drop if rep78== 2 | rep78==5
svy: tab rep78, count se
local maxcat =5
mata:
r = st_matrix("e(Row)")'
ct = st_matrix("e(Obs)")
serr= sqrt(diagonal(st_matrix("e(V)")))
d = J(`maxcat',3,0)
for (j = 1; j<=`e(r)'; j++) {
for (k = 1; k<=`maxcat'; k++) {
d[k,1] = k
if (r[j,1]== k) {
d[k,2] = ct[j,1]
d[k,3] = serr[j,1]
}
}
}
end
clear
getmata (rep78 count se) = d
replace se = . if count==0
format se %8.2f
list
save results, replace
Upvotes: 1