Reputation: 2583
suppose to have the following two datasets:
The first one contains a full list of dates per ID and an index that indicates if the patient has a comorbidity (1) or not (0). Then, there's another dataset that contains a subset of dates from the first dataset and a RefDate, 1 or 0. What I would like to do is the following:
for each ID:
if the comorbidity in DB1 = 1(!) before or = to the Start_Therapy where in DB2 RefDate = 1 then add an Index in DB1 equal to 1 otherwise the Index should be 0 regardless if in DB1 the comorbidity = 1. if the comorbidity in DB1 = 0 it should remain 0. Thank you in advance!
Desired output = DB3
data DB1; input ID Start_Therapy Comorbidity; format Start_Therapy date9.; cards; 0001 01OCT2015 1 0001 06DEC2016 1 0001 08NOV2020 1 0002 11JAN2014 0 0002 16JUN2014 0 0002 14MAY2015 1 0002 30JUN2015 1 0002 25FEB2016 0 0003 11SEP2012 0 0003 24AUG2014 1 0003 10DEC2014 1 0004 03JAN2014 0 0004 09FEB2014 1 0004 03AUG2015 1 0004 18MAY2016 0 ; data DB2; input ID Start_Therapy RefDate; format Start_Therapy date9.; cards; 0001 06DEC2016 1 0001 08NOV2020 0 0002 16JUN2014 1 0002 30JUN2015 0 0003 24AUG2014 1 0003 10DEC2014 0 0004 03AUG2015 1 ; data DB3; input ID Start_Therapy Comorbidity Index; format Start_Therapy date9.; cards; 0001 01OCT2015 1 1 0001 06DEC2016 1 1 0001 08NOV2020 1 0 0002 11JAN2014 0 0 0002 16JUN2014 0 0 0002 14MAY2015 1 0 0002 30JUN2015 1 0 0002 25FEB2016 0 0 0003 11SEP2012 0 0 0003 24AUG2014 1 1 0003 10DEC2014 1 0 0004 03JAN2014 0 0 0004 09FEB2014 1 1 0004 03AUG2015 1 1 0004 18MAY2016 0 0 ;
Upvotes: 0
Views: 46
Reputation: 2776
I used PROC SQL to join db1 with a subset of db2 (keeping only rows with RefDate = 1).
The result is different from your desired output as rows 4 and 5 (11JAN2014 and 16JUN2014 for ID2) have index=0 in my output, but index = 1 in the required output. My understanding is that because they have no comorbity in db1 before these dates, they should actually be 0.
proc sql;
create table db3 as
select db1.*, coalesce(b.refdate, 0) as Index
from db1 left join
(select * from db2 where refdate = 1) b on
(db1.id = b.id and db1.start_therapy <= b.start_therapy)
order by id, start_therapy
;
quit;
Obs ID Start_Therapy Comorbidity Index
1 1 01OCT2015 1 1
2 1 06DEC2016 1 1
3 1 08NOV2020 1 0
4 2 11JAN2014 0 0
5 2 16JUN2014 0 0
6 2 14MAY2015 1 0
7 2 30JUN2015 1 0
8 2 25FEB2016 0 0
9 3 11SEP2012 0 0
10 3 24AUG2014 1 1
11 3 10DEC2014 1 0
12 4 03JAN2014 0 0
13 4 09FEB2014 1 1
14 4 03AUG2015 1 1
15 4 18MAY2016 0 0
Upvotes: 0
Reputation: 27508
Some of your example seems flawed for ID=2. The dates 11JAN2014 and 16JUN2014 have como 0, so wouldn't INDEX be 0 in the result.
Use an existential sub-query to flag your criteria.
Example:
proc sql;
create table want as
select *,
case
when comorbidity = 0 then 0
else exists (
select * from DB2
where refdate=1
and DB2.ID = DB1.ID
and DB2.start_therapy >= DB1.start_therapy
)
end as index
from DB1
;
Upvotes: 3