btiger
btiger

Reputation: 35

SAS - remove lines from a file based on in-between period from another file

I have the following two datasets. I want to delete the lines from A with dates that falls within the in-out period in file B.

File A

SEQ ID  date    var1
1   A12 03JAN04 242
2   A12 01FEB06 356
3   A12 06JAN08 325
4   A12 28DEC09 123
5   B34 06MAY03 985
6   B34 13JUN03 198
7   B34 10MAY05 241
8   C56 09NOV10 155
9   C56 19OCT13 352

File B

ID  date_in date_out
A12 01JAN04 04JAN04
A12 05FEB08 08FEB08
B34 03MAY03 06MAY03
B34 09MAY05 19MAY05
C56 12JUL12 18JUL12

For example, for ID=A12, his first obs falls within 01JAN04-04JAN04, thus I delete that obs, but keep the rest.

Essentially, I should get the following results

SEQ ID  date    var1
2   A12 01FEB06 356
3   A12 06JAN08 325
4   A12 28DEC09 123
6   B34 13JUN03 198
8   C56 09NOV10 155
9   C56 19OCT13 352

Both file A and B has multiple records per ID, and the between in-out period in file B is not alway constant.

I think in R, I could probably write a loop through lines in A, subset out a list from B that matches the ID, then loop through that list of period from B see if date in A falls within the in-between period.

But I need to this in SAS, so I'm really clueless about where to start. How do I even loop through line-by-line in SAS? eg. should I write a macro that loops through both file some how? or use SQL somehow?

I would greatly appreciate some advice on how to approach this problem. Thanks!

Upvotes: 3

Views: 113

Answers (1)

Dominic Comtois
Dominic Comtois

Reputation: 10411

PROC SQL easily takes care of that...

proc sql;
  create table bad_seqs as
    select A.SEQ
      from A
      left join B
             on A.ID = B.ID
      where A.date between B.date_in and B.date_out;

  delete from A where SEQ in (select SEQ from bad_seqs);
quit;

Upvotes: 2

Related Questions