Reputation: 59
I feel like I am making this more complicated than it should be. I have a sample dataset below with an ID column and a Counter column. The counter column resets and I would like to create a dataset containing only the rows where the counter column is the max value before it resets again. My dataset also has thousands of ID's that I would need to do this for.
data test;
infile datalines delimiter=",";
informat ID $3.
TCOUNT 10.;
input ID $ TCOUNT $ ;
datalines;
123,1
123,2
123,3
123,4
123,1
123,2
123,3
123,1
123,2
;
run;
and my desired output in a new table would look like...
ID TCOUNT
123 4
123 3
123 2
Upvotes: 0
Views: 101
Reputation: 51566
It might be easiest/clearest to first assign a label to each of the non-decreasing TCOUNT blocks of observations.
data groups;
set test;
by id ;
if first.id then group=0;
if first.id or tcount<lag(tcount) then group+1;
run;
Then it is a simple matter to find the last observation in each group.
data want;
set groups;
by id group;
if last.group;
run;
Upvotes: 1