Reputation: 401
This is a follow-up to my previous post on SO.
I am trying to produce a frequency table of demographics, including race, sex, and ethnicity. One table is a crosstab of race by sex for Hispanic participants in a study. However, there are no Hispanic participants thus far. So, the table will be all zeroes, but we still have to report it.
This can be done in R, but so far, I have found no solution for SAS. Example data is below.
data race;
input race eth sex ;
cards;
1 2 1
1 2 1
1 2 2
2 2 1
2 2 2
2 2 1
3 2 2
3 2 2
3 2 1
4 2 2
4 2 1
4 2 2
run;
data class;
do race = 1,2,3,4,5,6,7;
do eth = 1,2,3;
do sex = 1,2;
output;
end;
end;
end;
run;
proc format;
value frace 1 = "American Indian / AK Native"
2 = "Asian"
3 = "Black or African American"
4 = "Native Hawiian or Other PI"
5 = "White"
6 = "More than one race"
7 = "Unknown or not reported" ;
value feth 1 = "Hispanic or Latino"
2 = "Not Hispanic or Latino"
3 = "Unknown or Not reported" ;
value fsex 1 = "Male"
2 = "Female" ;
run;
***** ethnicity by sex ;
proc tabulate data = race missing classdata=class ;
class race eth sex ;
table eth, sex / misstext = '0' printmiss;
format race frace. eth feth. sex fsex. ;
run;
***** race by sex ;
proc tabulate data = race missing classdata=class ;
class race eth sex ;
table race, sex / misstext = '0' printmiss;
format race frace. eth feth. sex fsex. ;
run;
***** race by sex, for Hispanic only ;
***** log indicates that a logical page with only missing values has been deleted ;
***** Thanks SAS, you're a big help... ;
proc tabulate data = race missing classdata=class ;
where eth = 1 ;
class race eth sex ;
table race, sex / misstext = '0' printmiss;
format race frace. eth feth. sex fsex. ;
run;
I understand that the code really can't work because I'm selecting where eth
is equal to 1 (there are no cases satisfying the condition...). Specifying the command to be run by eth
doesn't work either.
Any guidance is greatly appreciated...
Upvotes: 0
Views: 1097
Reputation: 4282
I found this paper to be very informative:
Oh No, a Zero Row: 5 Ways to Summarize Absolutely Nothing
The preloadfmt
option in proc means (Method 5) is my favorite. Once you create the necessary formats it's not necessary to add dummy data. It's odd that they haven't yet added this option to proc freq.
Upvotes: 0
Reputation: 401
Looks like our good friends at Westat have worked with this issue. A description of there solution is shown here.
The code is shown below for convenience, but please cite the original when referenced
PROC FORMAT;
value ethnicf
1 = 'Hispanic or Latino'
2 = 'Not Hispanic or Latino'
3 = 'Unknown (Individuals Not Reporting Ethnicity)';
value racef
1 = 'American Indian or Alaska Native'
2 = 'Asian'
3 = 'Native Hawaiian or Other Pacific Islander'
4 = 'Black or African American'
5 = 'White'
6 = 'More Than One Race'
7 = 'Unknown or Not Reported';
value gndrf
1 = 'Male'
2 = 'Female'
3 = 'Unknown or Not Reported';
RUN;
DATA shelldata;
format ethlbl ethnicf. racelbl racef. gender gndrf.;
do ethcat = 1 to 2;
do ethlbl = 1 to 3;
do racelbl = 1 to 7;
do gender = 1 to 3;
output;
end;
end;
end;
end;
RUN;
DATA test;
input pt $ 1-3 ethlbl gender racelbl ;
cards;
x1 2 1 5
x2 2 1 5
x3 2 1 5
x4 2 1 5
x5 2 1 5
x6 2 2 2
x7 2 2 2
x8 2 2 5
x9 2 2 4
x10 2 2 4
RUN;
DATA enroll;
set test;
if ethlbl = 1 then ethcat = 1;
else ethcat = 2;
format ethlbl ethnicf. racelbl racef. gender gndrf.;
label ethlbl = 'Ethnic Category'
racelbl = 'Racial Categories'
gender = 'Sex/Gender';
RUN;
%MACRO TAB_WHERE;
/* PROC SQL step creates a macro variable whose */
/* value will be the number of observations */
/* meeting WHERE clause criteria. */
PROC SQL noprint;
select count(*)
into :numobs
from enroll
where ethcat=1;
QUIT;
/* PROC FORMAT step to display all numeric values as zero. */
PROC FORMAT;
value allzero low-high=' 0';
RUN;
/* Conditionally execute steps when no observations met criteria. */
%if &numobs=0 %then
%do;
%let fmt = allzero.; /* Print all cell values as zeroes */
%let str = ; /*No Cases in Subset - WHERE cannot be used */
%end;
%else
%do;
%let fmt = 8.0;
%let str = where ethcat = 1;
%end;
PROC TABULATE data=enroll classdata=shelldata missing format=&fmt;
&str;
format racelbl racef. gender gndrf.;
class racelbl gender;
classlev racelbl gender;
keyword n pctn all;
tables (racelbl all='Racial Categories: Total of Hispanic or Latinos'),
gender='Sex/Gender'*N=' ' all='Total'*n='' / printmiss misstext='0'
box=[LABEL=' '];
title1 font=arial color=darkblue h=1.5 'Inclusion Enrollment Report';
title2 ' ';
title3 font=arial color=darkblue h=1' PART B. HISPANIC ENROLLMENT REPORT:
Number of Hispanic or Latinos Enrolled to Date (Cumulative)';
RUN;
%MEND TAB_WHERE;
%TAB_WHERE
Upvotes: 1
Reputation: 63434
I think the easiest way is to create a row in the data that has the missing value. You could look at the following paper for suggestions as to how to do this on a larger scale:
http://www.nesug.org/Proceedings/nesug11/pf/pf02.pdf
PROC FREQ has the SPARSE option, which gives you all possible combinations of all variables in the table (including missing ones), but it doesn't look like that gives you exactly what you need.
Upvotes: 1