Reputation: 11
I'm looking for a macro or something in SAS that can help me in isolating the outliers from a dataset. I define an outlier as: Upperbound: Q3+1.5(IQR) Lowerbound: Q1-1.5(IQR). I have the following SAS code:
title 'Fall 2015';
proc univariate data = fall2015 freq;
var enrollment_count;
histogram enrollment_count / vscale = percent vaxis = 0 to 50 by 5 midpoints = 0 to 300 by 5;
inset n mean std max min range / position = ne;
run;
I would like to get rid of the outliers from fall2015 dataset. I found some macros, but no luck in working the macro. Several have a class variable which I don't have. Any ideas how to separate my data?
Upvotes: 0
Views: 4114
Reputation: 21294
Here's a macro I wrote a while ago to do this, under slightly different rules. I've modified it to meet your criteria (1.5).
Call macro using call execute and boundaries set, using the output from step 1.
*Calculate IQR and first/third quartiles;
proc means data=sashelp.class stackods n qrange p25 p75;
var weight height;
ods output summary=ranges;
run;
*create data with outliers to check;
data capped;
set sashelp.class;
if name='Alfred' then weight=220;
if name='Jane' then height=-30;
run;
*macro to cap outliers;
%macro cap(dset=,var=, lower=, upper=);
data &dset;
set &dset;
if &var>&upper then &var=&upper;
if &var<&lower then &var=&lower;
run;
%mend;
*create cutoffs and execute macro for each variable;
data cutoffs;
set ranges;
lower=p25-1.5*qrange;
upper=p75+1.5*qrange;
string = catt('%cap(dset=capped, var=', variable, ", lower=", lower, ", upper=", upper ,");");
call execute(string);
run;
Upvotes: 0