user190080
user190080

Reputation: 535

Produce custom table in SAS with a subsetted data set

I want to use SAS and eg. proc report to produce a custom table within my workflow.

Why: Prior, I used proc export (dbms=excel) and did some very basic stats by hand and copied pasted to an excel sheet to complete the report. Recently, I've started to use ODS excel to print all the relevant data to excel sheets but since ODS excel would always overwrite the whole excel workbook (and hence also the handcrafted stats) I now want to streamline the process.

The task itself is actually very straightforward. We have some information about IDs, age, and registration, so something like this:

data test;
input ID $   AGE  CENTER $;
datalines;
111   23   A
.     27   B
311   40   C
131   18   A
.     64   A
;
run;

The goal is to produce a table report which should look like this structure-wise:

                 ID   NO-ID  Total
Count            3     2      5
Age (mean)       27    45.5   34.4
Count by Center:
A                2     1      3
B                0     1      1
A                1     0      1

It seems, proc report only takes variables as columns but not a subsetted data set (ID NE .; ID =''). Of course I could just produce three reports with three subsetted data sets and print them all separately but I hope there is a way to put this in one table.

Is proc report the right tool for this and if so how should I proceed? Or is it better to use proc tabulate or proc template or...?

Upvotes: 0

Views: 124

Answers (1)

user190080
user190080

Reputation: 535

I found a way to achieve an almost match to what I wanted. First if all, I had to introduce a new variable vID (valid ID, 0 not valid, 1 valid) in the data set, like so:

data test;
input ID $   AGE  CENTER $;
if ID = '' then vID = 0;
    else vID = 1;
datalines;
111   23   A
.     27   B
311   40   C
131   18   A
.     64   A
;
run;

After this I was able to use proc tabulate as suggested by @Reeza in the comments to build a table which pretty much resembles what I initially aimed for:

proc tabulate data = test;   
class vID Center; 
var age;
keylabel N = 'Count';  
table N age*mean Center*N, vID ALL;
run;

enter image description here

Still, I wonder if there is a way without introducing the new variable at all and just use the SAS counters for missing and non-missing observations.

UPDATE:

@Reeza pointed out to use the proc format to assign a value to missing/non-missing ID data. In combination with the missing option (prints missing values) in proc tabulate this delivers the output without introducing a new variable:

proc format;
value $ id_fmt 
    ' ' = 'No-ID'
    other = 'ID'
    ;
run;

proc tabulate data = test missing;
format ID $id_fmt.;
class ID Center; 
var age;
keylabel N = 'Count';  
table N age*(mean median) Center*N, (ID=' ') ALL;
run;

Upvotes: 1

Related Questions