Reputation: 13
I have data that looks like -
data abc;
input ID $ drug $ episode start_date date9. end_date date9.;
format start_date end_date date9.;
informat start_date end_date date9.;
datalines ;
1 A 1 01Jan2012 30Mar2012
1 A 2 01May2012 03Jul2012
1 A 3 28Sep2012 28Oct2012
1 A 4 01Nov2012 30Dec2012
1 B 1 01Apr2012 10May2012
1 B 2 02Nov2012 28Dec2012
1 B 3 01Jan2012 30Mar2012
1 C 1 01Jul2012 02Aug2012
;
run;
Here we have subjects and the the drugs they take. A new episode of one drug means that the person discontinued.
If the start date (start date of 1st episode) of second drug consumed , lies in between the episodes of first drug , then we will ignore all the further episodes of 1st drug.
Eg. here 1 april (start date of drug B) lies after the first episode of drug A, so episode 2,3,4 of drug A would be deleted.
Similarly the start date for drug C lies after the end date of episode 1 for drug B then episode 2 of drug B would be deleted.
The maximum number of episodes a subject can have is 15.
The resultant dataset should look like -
ID Drug Episode start_date end_date
1 A 1 1-Jan 30-Mar
1 B 1 1-Apr 10-May
1 C 1 1-Jul 2-Aug
Upvotes: 0
Views: 63
Reputation: 4937
How about this? I added another ID to the example data for demonstration.
data abc;
input ID $ drug $ episode start_date :date9. end_date :date9.;
format start_date end_date date9.;
datalines ;
1 A 1 01Jan2012 30Mar2012
1 A 2 01May2012 03Jul2012
1 A 3 28Sep2012 28Oct2012
1 A 4 01Nov2012 30Dec2012
1 B 1 01Apr2012 10May2012
1 B 2 02Nov2012 28Dec2012
1 B 3 01Jan2012 30Mar2012
1 C 1 01Jul2012 02Aug2012
2 A 1 01Jan2012 30Mar2012
2 A 2 01May2012 03Jul2012
2 A 3 28Sep2012 28Oct2012
2 A 4 01Nov2012 30Dec2012
2 B 1 01Apr2012 10May2012
2 B 2 02Nov2012 28Dec2012
2 B 3 01Jan2012 30Mar2012
2 C 1 01Jul2012 02Aug2012
;
run;
data want;
format ID drug episode start_date end_date;
keep ID drug episode start_date end_date;
declare hash h ();
h.definekey ('ID', 'd');
h.definedata ('_start_date');
h.definedone ();
do until (lr1);
set abc (rename= (start_date = _start_date)) end=lr1;
by ID drug;
if first.ID then d = 0;
if first.drug then d + 1;
if episode = 1 then h.add();
end;
do until (lr2);
set abc end=lr2;
by ID drug;
if first.ID then d = 0;
if first.drug then do;
d + 1; flag = 0;
end;
rc = h.find(key : ID, key : d+1);
if start_date > _start_date then flag=1;
if flag = 0 then output;
end;
retain flag;
run;
Result:
ID drug episode start_date end_date
1 A 1 01JAN2012 30MAR2012
1 B 1 01APR2012 10MAY2012
1 C 1 01JUL2012 02AUG2012
2 A 1 01JAN2012 30MAR2012
2 B 1 01APR2012 10MAY2012
2 C 1 01JUL2012 02AUG2012
Upvotes: 0