Reputation: 13
I would like to pick values within an ID variable which are 10% of each other. For example, my data looks like this:
ID Var1
001 100
001 109
001 200
001 210
001 220
001 300
001 310
002 500
002 510
My desired output is some way to flag this so that I can separate this into groups:
ID Var1 Flag
001 100 1
001 109 1
001 200 2
001 210 2
001 220 2
001 300 3
001 310 3
002 500 1
002 510 1
I tried using a lag function and flagging data but it only flags the second row in a pair; I am not able to pull both the values in a pair that are within 10 percent of each other.
Upvotes: 0
Views: 774
Reputation: 21264
Here's how to flag if the difference between records are within 10% of each other. You can determine the 10% ratio by dividing the numbers, subtracting 1 and taking the absolute value. This assumes your data is sorted by ID and ascending var1 value.
data want;
set have;
by ID;
retain group;
lagv1=lag(var1);
if first.id then do;
lagv1=.;
group=1;
end;
else do;
diff = abs(var1/lagv1-1);
if diff >0.1 then group+1;
end;
run;
Upvotes: 1