Reputation: 2583
suppose to have two datasets:
data DB1;
input ID :$20.Reference_date :date9. Discharge :date9.;
format Reference_date Discharge date9.;
cards;
0001 14JUN2017 19JUN2017
0002 12MAR2016 17MAR2016
0003 01MAY2016 05MAY2016
0004 19MAR2017 22MAR2017
0005 10MAR2017 22MAR2017
0007 10OCT2015 14OCT2015
;
data DB2;
input ID :$20.Discharge_new :date9.;
format Discharge_new date9.;
cards;
0001 21JUN2017
0002 13MAR2016
0003 04MAY2016
0004 19MAR2017
0005 22MAR2017
0006 27JUN2022
0007 18OCT2015
;
Is there a way to check if, for each ID, the discharge_new from DB2 is in the interval of Reference_date-Discharge (>=; <=) of DB1? If yes add a Flag = 1 (otherwise Flag = 0) to DB2 to get DB3. Note that there are some IDs in DB2 not present in DB1 that should remain missing. I don't know how to deal with dates comparison from two different datasets.
Thank you in advance.
Desired output:
data DB3;
input ID :$20.Discharge_new :date9. Flag :$20.;
format Discharge_new date9.;
cards;
0001 21JUN2017 0
0002 13MAR2016 1
0003 04MAY2016 1
0004 19MAR2017 1
0005 22MAR2017 1
0006 27JUN2022 .
0007 18OCT2015 0
;
Upvotes: 0
Views: 34
Reputation: 2776
This assumes your data are sorted by ID and there are no duplicate IDs (both of which are true in your sample data).
Use the (in = )
data set option on the merge to create a check that the row is present in the relevant data set.
data db3;
merge db1 (in = indb1) db2 (in = indb2);
by id;
* only keep rows from db2;
if indb2;
* only assign a flag of 1 or 0 if there is a match in db1;
if indb1 then do;
if discharge_new <= reference_date <= discharge then flag = 1;
else flag = 0;
end;
keep id discharge_new flag;
run;
Upvotes: 2
Reputation: 21294
SQL is always good for these types of comparisons.
proc sql;
create table want as
select t2.id,
t2.discharge_new,
case when (not missing(t1.id) and discharge_new between reference_date and discharge) then 1
when not missing(t1.id) then 0
else .
end as flag
from db1 as t1
right join db2 as t2
on t1.id = t2.id;
quit;
Upvotes: 1