Reputation: 97
I have read a lot about the first. and last. variables and basic calculations in SAS, but I would like to solve the following problem in the data step (if possible):
I need to flag each observation that falls outside the quartiles, i.e. below the 25th percentile or above the 75th percentile. In other words, I want to identify outliers and give each observation a 1 or 0 (outlier or not). The problem is that I want to do this separately for each group/class of observations in the dataset.
Group Value OutlierFlag
a 1 1
a 10 0
a 11 0
a 400 1
b 2 0
b 2 0
b 500 1
To complicate things further: I also need a time grid, which means I need to sum all observations within each minute and write the result onto a grid (the current observations are not at fixed time intervals). I have already produced the grid (in one-minute steps). But how can I sum the observations and bring them into the grid data step for each minute? I am sorry if this is too vague, but maybe one of you knows how to do this or has an idea. I would be very thankful!
Best!
EDIT:
Alright, I tested:
proc means data = MM.Data median P25 P75;
class Security;
ods output Summary=mm.Data_median;
var price spread;
run;
data mm.data;
set mm.Data_median;
run;
That basically gives me the PROC MEANS output. But I want the original dataset with the P25 and P75 variables added to it. Then I tried:
proc sql;
create table mm.newData as
select *, sum(spread) as sumspread
from mm.Data
group by RIC;
quit;
But first, it groups the data again, and second, there is no P25 function (I only used sum to try it out).
Upvotes: 1
Views: 544
Reputation: 21274
Compute the statistics per group with PROC MEANS, then merge them back into the original data using a BY group:
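/* Per-group statistics; STACKODS names the ODS columns Median, P25, P75 */
proc means data=sashelp.class nway stackods median p25 p75;
class sex;
var weight;
ods output summary=stats;
run;
/* MERGE with BY requires the detail data to be sorted by the group variable */
proc sort data=sashelp.class out=class;
by sex;
run;
/* Attach the group statistics to every observation and flag values above P75 */
data want;
merge class stats (keep=sex median p25 p75);
by sex;
flag_p75=ifn(weight>p75, 1, 0);
run;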
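The flag above only marks values above P75. Since the question asks for outliers in both directions, here is a minimal sketch of a two-sided flag using the same merge approach. It assumes the same sashelp.class example data; the names want_both and outlier_flag are made up for illustration, so substitute your own group variable (e.g. Security) and analysis variable (e.g. spread).

```sas
/* Per-group quartiles; STACKODS names the ODS columns P25 and P75 */
proc means data=sashelp.class nway stackods p25 p75;
  class sex;
  var weight;
  ods output summary=stats;
run;

/* Sort the detail data so it can be merged by group */
proc sort data=sashelp.class out=class;
  by sex;
run;

/* want_both and outlier_flag are hypothetical names */
data want_both;
  merge class stats (keep=sex p25 p75);
  by sex;
  /* A SAS comparison evaluates to 1 (true) or 0 (false).            */
  /* Missing values sort below any number, so exclude them explicitly. */
  outlier_flag = (not missing(weight)) and (weight < p25 or weight > p75);
run;
```

Because a logical expression in a data step already evaluates to 1 or 0, no ifn() call is needed here, although ifn() works just as well.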
Upvotes: 3