blue and grey
blue and grey

Reputation: 401

Using SAS, is it possible to get a frequency table where no data exist?

This is a follow-up to my previous post on SO.

I am trying to produce a frequency table of demographics, including race, sex, and ethnicity. One table is a crosstab of race by sex for Hispanic participants in a study. However, there are no Hispanic participants thus far. So, the table will be all zeroes, but we still have to report it.

This can be done in R, but so far, I have found no solution for SAS. Example data is below.


data race;
input race  eth  sex   ;
cards;
1   2   1
1   2   1
1   2   2
2   2   1
2   2   2
2   2   1
3   2   2
3   2   2
3   2   1
4   2   2
4   2   1
4   2   2
run;




data class;
    do race = 1,2,3,4,5,6,7;
        do eth = 1,2,3;
            do sex = 1,2;
                output;
            end;
        end;
    end;
run;



proc format;

    value   frace   1 = "American Indian / AK Native"
                        2 = "Asian"
                        3 = "Black or African American"
                        4 = "Native Hawiian or Other PI"
                        5 = "White"
                        6 = "More than one race"
                        7 = "Unknown or not reported" ;

    value   feth            1 = "Hispanic or Latino"
                            2 = "Not Hispanic or Latino"
                            3 = "Unknown or Not reported" ;

    value   fsex        1 = "Male"
                        2 = "Female"  ;

run;






*****  ethnicity by sex  ;

proc tabulate data = race missing classdata=class ;
class  race eth sex ;
table eth, sex / misstext = '0' printmiss;
format race frace. eth feth.  sex fsex. ;
run;



*****  race by sex  ;

proc tabulate data = race missing classdata=class ;
class  race eth sex ;
table race, sex / misstext = '0' printmiss;
format race frace. eth feth.  sex fsex. ;
run;



*****  race by sex, for Hispanic only  ;
*****  log indicates that a logical page with only missing values has been deleted ;
*****  Thanks SAS, you're a big help...  ;

proc tabulate data = race missing classdata=class ;
where eth = 1 ;
class  race eth sex ;
table race, sex / misstext = '0' printmiss;
format race frace. eth feth.  sex fsex. ;
run;

I understand that the code really can't work because I'm selecting where eth is equal to 1 (there are no cases satisfying the condition...). Specifying the command to be run by eth doesn't work either.

Any guidance is greatly appreciated...

Upvotes: 0

Views: 1097

Answers (3)

Michael Richardson
Michael Richardson

Reputation: 4282

I found this paper to be very informative:

Oh No, a Zero Row: 5 Ways to Summarize Absolutely Nothing

The preloadfmt option in proc means (Method 5) is my favorite. Once you create the necessary formats it's not necessary to add dummy data. It's odd that they haven't yet added this option to proc freq.

Upvotes: 0

blue and grey
blue and grey

Reputation: 401

Looks like our good friends at Westat have worked with this issue. A description of there solution is shown here.

The code is shown below for convenience, but please cite the original when referenced


PROC FORMAT;
value ethnicf
1 = 'Hispanic or Latino'
2 = 'Not Hispanic or Latino'
3 = 'Unknown (Individuals Not Reporting Ethnicity)';
value racef
1 = 'American Indian or Alaska Native'
2 = 'Asian'
3 = 'Native Hawaiian or Other Pacific Islander'
4 = 'Black or African American'
5 = 'White'
6 = 'More Than One Race'
7 = 'Unknown or Not Reported';
value gndrf
1 = 'Male'
2 = 'Female'
3 = 'Unknown or Not Reported';
RUN;



DATA shelldata;
format ethlbl ethnicf. racelbl racef. gender gndrf.;
    do ethcat = 1 to 2;
        do ethlbl = 1 to 3;
            do racelbl = 1 to 7;
                do gender = 1 to 3;
                output;
                end;
            end;
        end;
    end;
RUN;



DATA test;
input pt $ 1-3 ethlbl gender racelbl ;
cards;
x1 2 1 5
x2 2 1 5
x3 2 1 5
x4 2 1 5
x5 2 1 5
x6 2 2 2
x7 2 2 2
x8 2 2 5
x9 2 2 4
x10 2 2 4
RUN;





DATA enroll;
set test;
if ethlbl = 1 then ethcat = 1;
else ethcat = 2;
format ethlbl ethnicf. racelbl racef. gender gndrf.;
label ethlbl = 'Ethnic Category'
racelbl = 'Racial Categories'
gender = 'Sex/Gender';
RUN;




%MACRO TAB_WHERE;


/* PROC SQL step creates a macro variable whose */
/* value will be the number of observations */
/* meeting WHERE clause criteria. */
PROC SQL noprint;
select count(*)
into :numobs
from enroll
where ethcat=1;
QUIT;


/* PROC FORMAT step to display all numeric values as zero. */
PROC FORMAT;
value allzero low-high='     0';
RUN;

/* Conditionally execute steps when no observations met criteria. */
%if &numobs=0 %then 
    %do;
        %let fmt = allzero.; /* Print all cell values as zeroes */
        %let str = ; /*No Cases in Subset - WHERE cannot be used */
    %end;

%else
    %do;
        %let fmt = 8.0;
        %let str = where ethcat = 1;
    %end;


PROC TABULATE data=enroll classdata=shelldata missing format=&fmt;
&str;
format racelbl racef. gender gndrf.;
class racelbl gender;
classlev racelbl gender;
keyword n pctn all;
tables (racelbl all='Racial Categories: Total of Hispanic or Latinos'),
gender='Sex/Gender'*N=' ' all='Total'*n='' / printmiss misstext='0'
box=[LABEL=' '];
title1 font=arial color=darkblue h=1.5 'Inclusion Enrollment Report';
title2 ' ';
title3 font=arial color=darkblue h=1' PART B. HISPANIC ENROLLMENT REPORT:
Number of Hispanic or Latinos Enrolled to Date (Cumulative)';
RUN;


%MEND TAB_WHERE;

%TAB_WHERE

Upvotes: 1

Joe
Joe

Reputation: 63434

I think the easiest way is to create a row in the data that has the missing value. You could look at the following paper for suggestions as to how to do this on a larger scale:

http://www.nesug.org/Proceedings/nesug11/pf/pf02.pdf

PROC FREQ has the SPARSE option, which gives you all possible combinations of all variables in the table (including missing ones), but it doesn't look like that gives you exactly what you need.

Upvotes: 1

Related Questions