J.Q

Reputation: 1031

Determining the frequency of ONLY certain values in all variables of a data set

I'd like to get a frequency table that lists all variables, but only tells me the number of times "-2", "-1" and "M" appear in each variable.

Currently, when I run the following code:

proc freq data=mydata;
  tables _ALL_ / list missing;
run;

I get one table for each variable and all of its values (sometimes 100s). Can I just get tables with the three values I want, and everything else suppressed?

Upvotes: 0

Views: 1193

Answers (1)

Joe

Reputation: 63434

You can do this a number of ways.

First off, you probably want to send the frequency output to a dataset first, so that you can then filter that dataset. I would use PROC TABULATE, but you can use PROC FREQ if you like it better.

*make up some data;
data mydata;
  call streaminit(132);
  array x[100];
  do _i = 1 to 50;
    do _t = 1 to dim(x);
      x[_t]= floor(rand('Uniform')*9-5);
    end;
    output;
  end;
  keep x:;
run;

ods _all_ close;   *close the 'visible' output types;
ods output onewayfreqs=outdata;  *output the onewayfreqs (one way frequency tables) to a dataset;
proc freq data=mydata;
  tables _all_/missing;
run;
ods output close;  *close the dataset;
ods preferences;   *open back up your default outputs;

Then filter it, and once you've done that, print it however you want. Note that in the PROC FREQ output dataset you get a separate column for each analysis variable, which is not super helpful. The F_ variables hold the formatted values, and they can be combined into a single column using coalesce. I assume here they're all numeric variables; define f_val as character and use coalescec if there are any character variables or variables with character-ish formats applied to them.

data has_values;
  set outdata;
  f_val = coalesce(of f_:);
  keep table f_val frequency percent;
  if f_val in (0,-1,-2);
run;

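The subsetting IF statement (the last line before run) keeps only the rows where the value is 0, -1 or -2; change that list to whichever values you need.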
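For the character case mentioned above, a minimal sketch might look like the following. It assumes the F_ columns in outdata hold the formatted values as text (possibly padded with blanks, hence the strip) and that the "M" in the question is printed as the single letter M. The names has_values_char and char_val are made up for illustration, and the length of 32 is a guess.

```sas
* Sketch only, has_values_char and char_val are hypothetical names;
data has_values_char;
  set outdata;
  length char_val $32;                   * assumed wide enough for the formatted values;
  char_val = strip(coalescec(of f_:));   * first non-blank F_ column, blanks trimmed;
  keep table char_val frequency percent;
  if char_val in ('-2','-1','M');        * the three values from the question;
run;

* print one row per variable and matching value;
proc print data=has_values_char noobs;
run;
```

The new variable is deliberately not given an f_ prefix, so it can never be picked up by the f_: list itself.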

Upvotes: 1
