Reputation: 747
I would like to remove outliers in the top and bottom 0.1%. PROC MEANS has the p99 statistic, which only lets me cut off the top 1%, not the top 0.1%. Is there another way to do this? I thought of PROC RANK, but I am not sure it would give the same result. My code is:
proc means data=input noprint;
by date;
output out=trunc(drop=_FREQ_ _TYPE_) p99(var1)=p99_var1 p99(var2)=p99_var2;
run;
data input;
merge input trunc;
by date;
if var1 < p99_var1 and var2 < p99_var2;
run;
versus
proc rank data=input out=input percent;
by date;
var var1 var2;
ranks percentile1 percentile2;
run;
data input;
set input;
where 0.001<percentile1<0.999 and 0.001<percentile2<0.999;
run;
I am aware that the first method uses 99% (because I don't know how to get 99.9% with it) while the second uses 99.9%. If I used 99% in the second method as well, which of the two would be the better approach, and would they yield the same result?
Upvotes: 1
Views: 562
Reputation: 49
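The TIES= and FRACTION options of proc rank should give you the flexibility you need for this problem. Note that the PERCENT option returns ranks on a 0 to 100 scale, so cutoffs such as 0.001 and 0.999 only make sense with FRACTION, which returns ranks on a 0 to 1 scale.
See the proc rank chapter of the SAS documentation for details.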
Upvotes: -1
Reputation: 7602
proc means only has access to certain default percentiles; however, you can specify custom percentiles in proc univariate:
proc univariate data=sashelp.prdsal3 noprint;
var actual;
output out=want pctlpre=P_ pctlpts=0.1,99.9;
run;
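Applied to the data in the question, the same idea might look roughly like the sketch below. It assumes the dataset input and the variables date, var1 and var2 from the question; the output names cutoffs and trimmed and the lo/hi suffixes are made up for illustration. Unlike the p99 version in the question, it trims both tails.

```sas
/* Sketch only: 'cutoffs', 'trimmed' and the lo/hi suffixes are hypothetical names. */

/* BY-group processing needs the data sorted by date */
proc sort data=input;
  by date;
run;

/* One row per date holding the 0.1th and 99.9th percentiles of each variable.
   PCTLPRE= gives one prefix per VAR variable, PCTLNAME= one suffix per
   percentile, so the output columns are var1_lo, var1_hi, var2_lo, var2_hi. */
proc univariate data=input noprint;
  by date;
  var var1 var2;
  output out=cutoffs pctlpre=var1_ var2_ pctlpts=0.1 99.9 pctlname=lo hi;
run;

/* Attach the cutoffs to every row and keep only rows strictly inside both ranges.
   Rows where var1 or var2 is missing fail the comparison and are dropped too. */
data trimmed;
  merge input cutoffs;
  by date;
  if var1_lo < var1 < var1_hi and var2_lo < var2 < var2_hi;
run;
```

Writing to a new dataset rather than overwriting input keeps the original rows available if you want to compare the result against the proc rank approach.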
Upvotes: 2