Reputation: 2614
I have a data set with many variables - many of which are character valued. I have the following code to count the number of missing values for each variable:
proc format;
value $missfmt ' '='Missing' other='Not Missing';
value missfmt . ='Missing' other='Not Missing';
run;
proc freq data=dataname;
format _CHAR_ $missfmt.; /* apply format for the duration of this PROC */
tables _CHAR_ / missing missprint nocum nopercent;
format _NUMERIC_ missfmt.;
tables _NUMERIC_ / missing missprint nocum nopercent;
run;
However, this results in a huge output (300 page pdf if I print to pdf) with 90% of the variables having no missing values. How do I tell PROC FREQ to only display the tables which have missing values?
Upvotes: 3
Views: 7307
Reputation: 11
This one removes all blank columns:
%macro removeblanks(dataset,output);
/* create dataset that holds variables with missing values */
ods select nlevels;
ods output nlevels=miss_vars (where=(nmisslevels>0 and nnonmisslevels=0));
ods noresults;
proc freq data=&dataset. nlevels;
run;
/* store names in a macro variable */
proc sql noprint;
select tablevar into :missvar separated by ' '
from miss_vars;
quit;
data &output.;
set &dataset.(drop=&missvar.);
run;
%mend removeblanks;`
Upvotes: 1
Reputation: 7602
You can identify which variables have a missing value from the NLEVELS option in PROC FREQ. So my process would be to create a dataset that just held the variables with missing values, then store them in a macro variable so the following PROC FREQ can be run against them only. Here is the code to do that.
/* set up dummy dataset */
data have;
set sashelp.class;
if _n_ in (10,13) then call missing(age,sex);
run;
/* create dataset that holds variables with missing values */
ods select nlevels;
ods output nlevels=miss_vars (where=(nmisslevels>0));
ods noresults;
proc freq data=have nlevels;
run;
ods results;
/* store names in a macro variable */
proc sql noprint;
select tablevar into :missvar separated by ' '
from miss_vars;
quit;
proc format;
value $missfmt ' '='Missing' other='Not Missing';
value missfmt . ='Missing' other='Not Missing';
run;
proc freq data=have (keep=&missvar.);
format _CHAR_ $missfmt.; /* apply format for the duration of this PROC */
tables _CHAR_ / missing missprint nocum nopercent;
format _NUMERIC_ missfmt.;
tables _NUMERIC_ / missing missprint nocum nopercent;
run;
Upvotes: 4