Reputation: 2529
Say that I have this MWE data:
data v;
input var1 $ var2 var3 $;
datalines;
cat 3 yes
sheep 2 no
sheep 3 maybe
pig 3 maybe
goat 3 maybe
cat 2 no
pig 1 no
cat 2 no
pig 1 no
goat 3 no
cat 3 no
cat 2 yes
cat 1 yes
sheep 3 no
cat 2 no
cat 1 maybe
;
run;
I use proc tabulate to compute the number of observations for each value. I do this for each of the variables:
proc tabulate data=v;
class var1;
table (var1='' all="Total"),(N pctn);
quit;
proc tabulate data=v;
class var2;
table (var2='' all="Total"),(N pctn);
quit;
proc tabulate data=v;
class var3;
table (var3='' all="Total"),(N pctn);
quit;
I get output that looks as follows:
N PctN
cat 8 50.00
goat 2 12.50
pig 3 18.75
sheep 3 18.75
Total 16 100.00
N PctN
1 4 25.00
2 5 31.25
3 7 43.75
Total 16 100.00
N PctN
maybe 4 25.00
no 9 56.25
yes 3 18.75
Total 16 100.00
My question is: How can I export this into Excel in the following format?:
Name Cat 1 N1 N1% Cat 2 N2 N2% Cat 3 N3 N3% Cat 4 N4 N4% Missing % Total Total%
var1 cat 8 50 goat 2 12.5 pig 3 18.75 sheep 3 18.75 0 16 100
var2 1 4 25 2 5 31.25 3 7 43.75 0 16 100
var3 maybe 4 25 no 9 56.25 yes 3 18.75 0 16 100
In other words, I want each different variable to have its own line. Each value of the variable will appear in this line, with the number of observations and % of total observations. The final three columns are a bonus, but not required: the % and number of missing observations and the total number of values for the variable. How can I do this?
Note that I am very novice at SAS. Any improvements in the code are also welcome, such as how to loop or condense the code to produce the tables.
Upvotes: 0
Views: 1575
Reputation: 27508
The desired data form is extremely messy and hard to consume as the number of variables and number of distinct values thereof increases.
These processing steps can be performed to achieve the output structure:
Example
Data has a fourth variable that has some missing values.
data have;
input var1 $ var2 var3 $ var4;
datalines;
cat 3 yes .
sheep 2 no .
sheep 3 maybe .
pig 3 maybe .
goat 3 maybe 1
cat 2 no 1
pig 1 no 1
cat 2 no 1
pig 1 no 1
goat 3 no 1
cat 3 no 1
cat 2 yes 1
cat 1 yes 1
sheep 3 no 1
cat 2 no 2
cat 1 maybe 1
;
run;
options missing = ' ';
proc transpose data=have_v out=vector1(index=(_name_));
by rowid;
var var1 var2 var3 var4;
run;
proc freq noprint data=vector1;
by _name_;
table col1 / missing out=freqs;
run;
options missing = '.';
data freqs_0;
set freqs;
by _name_;
retain nomiss;
if first._name_ then nomiss = not missing(col1);
if first._name_ then seq=1; else seq+1;
seqc = cats(seq);
if first._name_ and missing(col1) then do;
seqc = 'missing';
seq = 0;
end;
length widename $32;
if seqc ne 'missing' then do;
widename = cats("cat_",seqc);
widevalue = col1;
output;
end;
widename = cats("cat_",seqc,'_COUNT');
widevalue = COUNT;
output;
widename = cats("cat_",seqc,'_PERCENT');
widevalue = PERCENT;
output;
if last._name_ and nomiss then do;
seqc = 'missing';
widename = cats("cat_",seqc,'_COUNT');
widevalue = 0;
output;
widename = cats("cat_",seqc,'_PERCENT');
widevalue = 0;
output;
end;
keep _name_ widename widevalue;
run;
proc transpose data=freqs_0 out=wide;
by _name_;
id widename;
var widevalue;
run;
Upvotes: 1