bill999
bill999

Reputation: 2529

Count number of 0 values

Similar to here, I can count the number of missing observations:

data dataset;
  input a b c;
cards;
1 2 3
0 1 0
0 0 0
7 6 .
. 3 0
0 0 .
;
run;

proc means data=dataset NMISS N;
run;

But how can I also count the number of observations that are 0?

Upvotes: 2

Views: 6199

Answers (4)

DomPazz
DomPazz

Reputation: 12465

I add this as an additional answer. It requires you to have PROC IML.

This uses matrix manipulation to do the count.

(ds=0) -- creates a matrix of 0/1 values (false/true) of values = 0

[+,] -- sums the rows for all columns. If we have 0/1 values, then this is the number of value=0 for each column.

' -- operator is transpose.

|| -- merge matrices {0} || {1} = {0 1}

Then we just print the values.

proc iml;
use dataset;
read all var _num_ into ds[colname=names];
close dataset;
ds2 = ((ds=0)[+,])`;

n = nrow(ds);

ds2 = ds2 || repeat(n,ncol(ds),1);

cnames = {"N = 0", "Count"};
mattrib ds2 rowname=names colname=cnames;

print ds2;
quit;

Upvotes: 2

Shenglin Chen
Shenglin Chen

Reputation: 4554

there is levels options in proc freq you could use.

proc freq data=dataset levels;
table _numeric_;
run;

Upvotes: 1

DomPazz
DomPazz

Reputation: 12465

Easiest to use PROC SQL. You will have to use a UNION to replicate the MEANS output;

Each section of the first FROM counts the 0 values for each variable and UNION stacks them up.

The last section just counts the number of observations in DATASET.

proc sql;
select n0.Variable, 
       n0.N_0 label="Number 0", 
       n.count as N
 from (
   select "A" as Variable,
          count(a) as N_0
      from dataset
      where a=0
   UNION 
   select "B" as Variable,
          count(b) as N_0
      from dataset
      where b=0
   UNION
   select "C" as Variable,
          count(c) as N_0
      from dataset
      where c=0
) as n0,
(
    select count(*) as count
    from dataset
) as n;
quit;

Upvotes: 1

Joe
Joe

Reputation: 63424

If you want to count the number of observations that are 0, you'd want to use proc tabulate or proc freq, and do a frequency count.

If you have a lot of values and you just want "0/not 0", that's easy to do with a format.

data have;
  input a b c;
cards;
1 2 3
0 1 0
0 0 0
7 6 .
. 3 0
0 0 .
;
run;

proc format;
  value zerof
  0='Zero'
  .='Missing'
  other='Not Zero';
quit;

proc freq data=have;
  format _numeric_ zerof.;
  tables _numeric_/missing;
run;

Something along those lines. Obviously be careful about _numeric_ as that's all numeric variables and could get messy quickly if you have a lot of them...

Upvotes: 5

Related Questions