Reputation: 535
I want to use SAS
and eg. proc report
to produce a custom table within my workflow.
Why: Prior, I used proc export (dbms=excel)
and did some very basic stats by hand and copied pasted to an excel sheet to complete the report. Recently, I've started to use ODS excel to print all the relevant data to excel sheets but since ODS excel would always overwrite the whole excel workbook (and hence also the handcrafted stats) I now want to streamline the process.
The task itself is actually very straightforward. We have some information about IDs, age, and registration, so something like this:
data test;
input ID $ AGE CENTER $;
datalines;
111 23 A
. 27 B
311 40 C
131 18 A
. 64 A
;
run;
The goal is to produce a table report which should look like this structure-wise:
ID NO-ID Total
Count 3 2 5
Age (mean) 27 45.5 34.4
Count by Center:
A 2 1 3
B 0 1 1
A 1 0 1
It seems, proc report
only takes variables as columns but not a subsetted data set (ID NE .; ID =''
). Of course I could just produce three reports with three subsetted data sets and print them all separately but I hope there is a way to put this in one table.
Is proc report
the right tool for this and if so how should I proceed? Or is it better to use proc tabulate
or proc template
or...?
Upvotes: 0
Views: 124
Reputation: 535
I found a way to achieve an almost match to what I wanted. First if all, I had to introduce a new variable vID (valid ID, 0 not valid, 1 valid)
in the data set, like so:
data test;
input ID $ AGE CENTER $;
if ID = '' then vID = 0;
else vID = 1;
datalines;
111 23 A
. 27 B
311 40 C
131 18 A
. 64 A
;
run;
After this I was able to use proc tabulate
as suggested by @Reeza in the comments to build a table which pretty much resembles what I initially aimed for:
proc tabulate data = test;
class vID Center;
var age;
keylabel N = 'Count';
table N age*mean Center*N, vID ALL;
run;
Still, I wonder if there is a way without introducing the new variable at all and just use the SAS
counters for missing and non-missing observations.
UPDATE:
@Reeza pointed out to use the proc format
to assign a value to missing/non-missing ID data. In combination with the missing option (prints missing values) in proc tabulate
this delivers the output without introducing a new variable:
proc format;
value $ id_fmt
' ' = 'No-ID'
other = 'ID'
;
run;
proc tabulate data = test missing;
format ID $id_fmt.;
class ID Center;
var age;
keylabel N = 'Count';
table N age*(mean median) Center*N, (ID=' ') ALL;
run;
Upvotes: 1