In SAS how to display unique counts when applying proc tabulate.

Question

I am looking for a way to produce a frequency table displaying number of unique values for variable ID per uniqe value of variable Subclass.

I would like to order the results by variable Class.

Preferably I would like to display the number of unique values for ID per Subclass as a fraction of n for ID. In the want-example below this values is displayed under %totalID.

In addition I would like to display the number of unique values for ID per Subclass as a fraction of the sum of unique ID values found within each Class. In the want-example below this values is displayed under %withinclassID.

Have:

ID        Class        Subclass         
-------------------------------   
ID1        1           1a          
ID1        1           1b           
ID1        1           1c           
ID1        2           2a

ID2        1           1a           
ID2        1           1b           
ID2        2           2a           
ID2        2           2b              
ID2        3           3a

ID3        1           1a                      
ID3        1           1d 
ID3        2           2a
ID3        3           3a           
ID3        3           3b

Want:

                         Unique number   
Class       Subclass       of IDs        %totalID        %withinclassID
--------------------------------------------------------------------
1            
             1a             3             100.0             50.00 
             1b             2             66.67             33.33 
             1c             1             33.33             16.67
             SUM            6
2            
             2a             3             100.0             75.00
             2b             1             33.33             25.00
             SUM            4
3            
             3a             2             66.67             66.67
             3b             1             33.33             33.33
             SUM            3

My initial approach was to perform a PROC FREQ on NLEVELS producing a frequency table for number of unique IDs per subclass. Here however I lose information on class. I therefore cannot order the results by class.

My second approach involved using PROC TABULATE. I however cannot produce any percentage calculations based on unique counts in such a table.

Is there a direct way to tabulate the frequencies of one variable according to a second variable, grouped by a third variable--displaying overall and within group percentages?

Reeza · Accepted Answer

You can do a double proc freq or SQL.

/*This demonstrates how to count the number of unique occurences of a variable across groups. It uses the SASHELP.CARS dataset which is available with any SAS installation. The objective is to determine the number of unique car makers by origin/

Note: The SQL solution can be off if you have a large data set and these are not the only two ways to calculate distinct counts. If you're dealing with a large data set other methods may be appropriate.*/

*Count distinct IDs;
proc sql;
create table distinct_sql as
select origin, count(distinct make) as n_make
from sashelp.cars
group by origin;
quit;

*Double PROC FREQ;
proc freq data=sashelp.cars noprint;
table origin * make / out=origin_make;
run;

proc freq data=origin_make noprint;
table origin / out= distinct_freq outpct;
run;

title 'PROC FREQ';
proc print data=distinct_freq;
run;
title 'PROC SQL';
proc print data=distinct_sql;
run;

In SAS how to display unique counts when applying proc tabulate.

Answers (2)

Related Questions