user3078807

Reputation: 29

SAS Finding Median without PROC MEANS

I have a dataset (already sorted by the Blood Pressure variable)

Blood Pressure
87
99
99
109
111
112
117
119
121
123
139
143
145
151
165
198

I need to find the median without using PROC MEANS. This data has 16 observations, so the median is (119+121)/2 = 120.

How can I write code that always finds the median, regardless of the number of observations? It should work for both an even and an odd number of observations.

And of course, PROC MEANS is not allowed.

Thank you.

Upvotes: 0

Views: 2490

Answers (2)

BellevueBob

Reputation: 9618

Assuming you have a data set named HAVE sorted by the variable BP, you can try this:

data want(keep=median);
  if mod(nobs,2) = 0 then do; /* even number of records in data set */
     j = nobs / 2;
     set HAVE(keep=bp) point=j nobs=nobs;
     k = bp;  /* hold value in temp variable */
     j + 1;
     set HAVE(keep=bp) point=j nobs=nobs;
     median = (k + bp) / 2;
     end;
  else do;
     j = round( nobs / 2 );
     set HAVE(keep=bp) point=j nobs=nobs;
     median = bp;
     end;
   put median=; /* if all you want is to see the result */
   output;      /* if you want it in a new data set */
   stop;        /* stop required to prevent infinite loop */
run;

This is "old fashioned" code; I'm sure someone can show another solution using hash objects that might eliminate the requirement to sort the data first.
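The two branches above can also be folded into one, since the two middle positions are floor((n+1)/2) and ceil((n+1)/2), which coincide when n is odd. The following is only a sketch under the same assumptions (a data set named HAVE, sorted by BP, with at least one observation); the variable names lo, hi and lower are illustrative and not part of the original answer.

```sas
/* Sketch: assumes HAVE is sorted by BP and is not empty */
data want(keep=median);
  /* nobs is set at compile time, so it can be used before SET executes */
  lo = floor((nobs + 1) / 2);   /* lower middle position */
  hi = ceil((nobs + 1) / 2);    /* upper middle position; equals lo when nobs is odd */
  set HAVE(keep=bp) point=lo nobs=nobs;
  lower = bp;                   /* hold the lower middle value */
  set HAVE(keep=bp) point=hi;
  median = (lower + bp) / 2;    /* average of the two middle values */
  put median=;
  output;
  stop;                         /* required with POINT= to prevent an infinite loop */
run;
```

For the 16 sample values this reads observations 8 and 9 (119 and 121); for an odd count it reads the same middle observation twice, so the average is just that value.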

Upvotes: 0

DomPazz

Reputation: 12465

I use an FCMP function for this. This is a generic quantile function from my personal library. Since the median is the 50th percentile, calling it with p = 0.5 returns the median. Note that it expects the array values in sorted order.

options cmplib=work.fns;
data input;
input BP;
datalines;
87
99
99
109
111
112
117
119
121
123
139
143
145
151
165
198 
;run;

proc fcmp outlib=work.fns.fns;
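function qtile_n(p, arr[*], n);
    /* p = quantile in [0,1]; arr = values in ascending order; n = number of values */
    alphap=1;
    betap=1;

    if n > 1 then do;
        m = alphap+p*(1-alphap-betap);  /* offset added to n*p */
        i = floor(n*p+m);               /* index of the lower neighbor */
        g = n*p + m - i;                /* interpolation weight */
        if i >= n then qp = arr[n];     /* p = 1: avoid reading arr[n+1] */
        else qp = (1-g)*arr[i] + g*arr[i+1];
    end;
    else 
        qp = arr[1];
    return(qp);
endsub;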
quit;

proc sql noprint;
select count(*) into :n from input;
quit;

data _null_;
set input end=last;
array v[&n] _temporary_;

v[_n_] = bp;

if last then do;
    med = qtile_n(.5,v,&n);
    put med=;
end;
run;
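As a quick check of the median case, here is the arithmetic the function performs for the 16 sample values with p = 0.5 and alphap = betap = 1 as set in the code. The symbols mirror the variable names in qtile_n; the notation x_{(i)} for the i-th sorted value is introduced here for illustration.

```latex
\begin{align*}
m   &= \alpha_p + p\,(1 - \alpha_p - \beta_p) = 1 + 0.5\,(1 - 1 - 1) = 0.5 \\
i   &= \lfloor n p + m \rfloor = \lfloor 16 \cdot 0.5 + 0.5 \rfloor = 8 \\
g   &= n p + m - i = 8.5 - 8 = 0.5 \\
q_p &= (1 - g)\,x_{(8)} + g\,x_{(9)} = 0.5 \cdot 119 + 0.5 \cdot 121 = 120
\end{align*}
```

With an odd count such as n = 15, the same steps give n p + m = 8, so i = 8 and g = 0, and the result is simply the middle value x_{(8)}.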

Upvotes: 1
