NewUsr_stat
NewUsr_stat

Reputation: 2583

Add a new index column based on dates comparison

suppose to have the following two datasets:

The first one contains a full list of dates per ID and an index that indicates if the patient has a comorbidity (1) or not (0). Then, there's another dataset that contains a subset of dates from the first dataset and a RefDate, 1 or 0. What I would like to do is the following:

for each ID:

if the comorbidity in DB1 = 1(!) before or = to the Start_Therapy where in DB2 RefDate = 1 then add an Index in DB1 equal to 1 otherwise the Index should be 0 regardless if in DB1 the comorbidity = 1. if the comorbidity in DB1 = 0 it should remain 0. Thank you in advance!

Desired output = DB3

   data DB1;
     input ID Start_Therapy Comorbidity;
     format Start_Therapy date9.;
   cards;
   0001 01OCT2015  1
   0001 06DEC2016  1
   0001 08NOV2020  1
   0002 11JAN2014  0
   0002 16JUN2014  0
   0002 14MAY2015  1
   0002 30JUN2015  1
   0002 25FEB2016  0
   0003 11SEP2012  0
   0003 24AUG2014  1
   0003 10DEC2014  1
   0004 03JAN2014  0
   0004 09FEB2014  1
   0004 03AUG2015  1
   0004 18MAY2016  0
   ;
   
   data DB2;
     input ID Start_Therapy  RefDate;
     format Start_Therapy date9.;
   cards;
   0001 06DEC2016  1
   0001 08NOV2020  0
   0002 16JUN2014  1
   0002 30JUN2015  0
   0003 24AUG2014  1
   0003 10DEC2014  0
   0004 03AUG2015  1
   ;
   
   data DB3;
     input ID  Start_Therapy  Comorbidity Index;
     format Start_Therapy date9.;
   cards;
   0001 01OCT2015  1  1
   0001 06DEC2016  1  1
   0001 08NOV2020  1  0
   0002 11JAN2014  0  0
   0002 16JUN2014  0  0
   0002 14MAY2015  1  0
   0002 30JUN2015  1  0
   0002 25FEB2016  0  0
   0003 11SEP2012  0  0
   0003 24AUG2014  1  1
   0003 10DEC2014  1  0
   0004 03JAN2014  0  0
   0004 09FEB2014  1  1
   0004 03AUG2015  1  1
   0004 18MAY2016  0  0
   ;

Upvotes: 0

Views: 46

Answers (2)

shaun_m
shaun_m

Reputation: 2776

I used PROC SQL to join db1 with a subset of db2 (keeping only rows with RefDate = 1).

The result is different from your desired output as rows 4 and 5 (11JAN2014 and 16JUN2014 for ID2) have index=0 in my output, but index = 1 in the required output. My understanding is that because they have no comorbity in db1 before these dates, they should actually be 0.

 proc sql;
     create table db3 as
     select db1.*, coalesce(b.refdate, 0) as Index
     from db1 left join 
        (select * from db2 where refdate = 1) b on 
            (db1.id = b.id and db1.start_therapy <= b.start_therapy)
     order by id, start_therapy
     ;
quit;


Obs    ID    Start_Therapy    Comorbidity    Index                       
                                                                        
 1     1      01OCT2015           1           1                         
 2     1      06DEC2016           1           1                         
 3     1      08NOV2020           1           0                         
 4     2      11JAN2014           0           0                         
 5     2      16JUN2014           0           0                         
 6     2      14MAY2015           1           0                         
 7     2      30JUN2015           1           0                         
 8     2      25FEB2016           0           0                         
 9     3      11SEP2012           0           0                         
10     3      24AUG2014           1           1                         
11     3      10DEC2014           1           0                         
12     4      03JAN2014           0           0                         
13     4      09FEB2014           1           1                         
14     4      03AUG2015           1           1                         
15     4      18MAY2016           0           0                        

Upvotes: 0

Richard
Richard

Reputation: 27508

Some of your example seems flawed for ID=2. The dates 11JAN2014 and 16JUN2014 have como 0, so wouldn't INDEX be 0 in the result.

Use an existential sub-query to flag your criteria.

Example:

proc sql;
  create table want as
  select *, 
  case 
    when comorbidity = 0 then 0
    else exists ( 
      select * from DB2 
      where refdate=1
        and DB2.ID = DB1.ID
        and DB2.start_therapy >= DB1.start_therapy
    )
  end as index
  from DB1
  ;

Upvotes: 3

Related Questions