Reputation: 21
For each ID, I want to drop all observations that come after a dummy variable takes the value 1. An example would be:
The data I have:
ID TIME DUMMY
1 1 0
1 2 0
1 3 1
1 4 0
2 1 0
2 2 0
2 3 1
3 1 1
3 2 0
3 3 0
The output I want:
ID TIME DUMMY
1 1 0
1 2 0
1 3 1
2 1 0
2 2 0
2 3 1
3 1 1
Upvotes: 1
Views: 3744
Reputation: 367
Using a dow-loop seems more elegant than retain and lag. Read more about this programming structure here: http://analytics.ncsu.edu/sesug/2011/SS01.Dorfman.pdf It's very useful.
data want(drop=printit);
if 0 then set have;
printit = last.id;
do until(last.id or dummy);
set have;
by id;
if printit then output;
end;
run;
The line before the do
statement is just to keep the columns in the original order.
Upvotes: 1
Reputation: 9569
This is one way of doing it:
data want(drop = a);
set have;
by id;
retain a;
if first.id then a = 1;
if a = 1 then output;
if dummy = 1 then a = 0;
run;
Upvotes: 2
Reputation: 1078
I will use the help of the OUTPUT
, RETAIN
and RETURN
statements to override the DATA
step loop. The RETURN
statement forces the DATA
step to go to the next iteration immediately. This should drop all the observations between the DUMMYs = 1 observations:
data want;
set test;
drop count;
if DUMMY = 1 then do;
retain count;
count = 1;
output;
return;
end;
if count = 1 and DUMMY ne 1 then do;
retain count;
delete;
return;
end;
output;
run;
OUTPUT:
ID TIME DUMMY
1 1 0
1 2 0
1 3 1
2 3 1
3 1 1
Upvotes: 0