Dreamer

Reputation: 13

Remove duplicates inside a sequence of records in a group with SAS

Is it possible to remove consecutive duplicate records within a specific group and output only the last of them (based on date) in SAS (4GL)? I have data like:

data example;
input obs id dt value WANT_TO_SELECT;
cards;
1 10 1 500 0
2 10 2 750 1
3 10 3 750 1
4 10 4 750 0
5 10 5 500 0
6 20 1 150 1
7 20 2 150 0
8 20 3 370 0
9 20 4 150 0
;
run;

As you can see, for id=10 I would like to keep only one (the last) record with value 750, because those records follow one another, while value 500 can appear twice because its records are separated. I tried using FIRST./LAST. but I am not sure how to sort the data.

Upvotes: 1

Views: 306

Answers (1)

Tom

Reputation: 51566

Looks like a use case for the NOTSORTED keyword of the BY statement. This will let you use VALUE as a BY variable even though the data is not actually sorted by VALUE. That way the LAST.VALUE flag can be used.

data want;
  set example;
  by id value notsorted;
  if last.value;
run;

Results:

                                   WANT_TO_
Obs    obs    id    dt    value     SELECT

 1      1     10     1     500         0
 2      4     10     4     750         0
 3      5     10     5     500         0
 4      7     20     2     150         0
 5      8     20     3     370         0
 6      9     20     4     150         0
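
To see why this works, it can help to print the FIRST. and LAST. flags that the BY statement creates for each row. The following is only a sketch: the dataset names sorted and flags and the variables first_value and last_value are made up for illustration, and the PROC SORT step assumes the rows might not already arrive ordered by id and dt (in the example data they do, so it changes nothing).

```sas
/* Assumption: put rows in date order within each id first. */
proc sort data=example out=sorted;
  by id dt;
run;

/* Copy the automatic FIRST./LAST. flags into regular variables,  */
/* because the automatic ones are not written to the output data. */
data flags;
  set sorted;
  by id value notsorted;
  first_value = first.value;  /* 1 on the first row of each run of equal values */
  last_value  = last.value;   /* 1 on the last row of each run of equal values  */
run;

proc print data=flags noobs;
  var obs id dt value first_value last_value;
run;
```

With the example data, last_value is 1 for obs 1, 4, 5, 7, 8 and 9, which are the rows kept by the subsetting IF above. A run also ends whenever id changes, since id precedes value in the BY statement.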

Upvotes: 2
