Reputation: 73
I am stuck in transforming the data table from one format to another format using the SAS Programming function. The structure of the Table is given as below:
id Date Time assigned_pat_loc prior_pat_loc Activity
1 May/31/11 8:00 EIAB^EIAB^6 Admission
1 May/31/11 9:00 8w^201 EIAB^EIAB^6 Transfer to 8w
1 Jun/8/11 15:00 8w^201 Discharge
2 May/31/11 5:00 EIAB^EIAB^4 Admission
2 May/31/11 7:00 10E^45 EIAB^EIAB^4 Transfer to 10E
2 Jun/1/11 1:00 8w^201 10E^45 Transfer to 8w
2 Jun/1/11 8:00 8w^201 Discharge
3 May/31/11 9:00 EIAB^EIAB^2 Admission
3 Jun/1/11 9:00 8w^201 EIAB^EIAB^2 Transfer to 8w
3 Jun/5/11 9:00 8w^201 Discharge
4 May/31/11 9:00 EIAB^EIAB^9 Admission
4 May/31/11 7:00 10E^45 EIAB^EIAB^9 Transfer to 10E
4 Jun/1/11 8:00 10E^45 Death
I want to transform it into the following table.
Here are the details of the variables.
Please anyone guide me how it could be done?
Upvotes: 0
Views: 457
Reputation: 27508
When dealing with ranges in which there is a possibility of an unexpected overlap case you can enumerate over the range and perform simpler logic for finding shared time/unit/room.
Example:
data have;
length id date time 8 loc ploc $20 activity $10;
input
id Date& date11. Time time5. loc ploc Activity;
format date date9. time time5.;
datetime = dhms (date,0,0,0) + time;
length unit room bed punit proom pbed $4;
unit = scan(loc,1,'^');
room = scan(loc,2,'^');
bed = scan(loc,3,'^');
punit = scan(ploc,1,'^');
proom = scan(ploc,2,'^');
pbed = scan(ploc,3,'^');
drop loc ploc;
datalines;
1 31-May-2011 8:00 EIAB^EIAB^6 . Admission
1 31-May-2011 9:00 8w^201 EIAB^EIAB^6 Transfer to 8w
1 8-Jun-2011 15:00 8w^201 . Discharge
2 31-May-2011 5:00 EIAB^EIAB^4 . Admission
2 31-May-2011 7:00 10E^45 EIAB^EIAB^4 Transfer to 10E
2 1-Jun-2011 1:00 8w^201 10E^45 Transfer to 8w
2 1-Jun-2011 8:00 8w^201 . Discharge
3 31-May-2011 9:00 EIAB^EIAB^2 . Admission
3 1-Jun-2011 9:00 8w^201 EIAB^EIAB^2 Transfer to 8w
3 5-Jun-2011 9:00 8w^201 . Discharge
4 31-May-2011 9:00 EIAB^EIAB^9 . Admission
4 31-May-2011 7:00 10E^45 EIAB^EIAB^9 Transfer to 10E
4 1-Jun-2011 8:00 10E^45 . Death
;
* Fill in the ranges to get data by hour;
data hours(keep=id in_unit in_room at_dt);
set have;
by id;
retain at_dt in_unit in_room;
if first.id then do;
at_dt = datetime;
in_unit = unit;
in_room = room;
end;
else do;
do at_dt = at_dt to datetime-1 by dhms(0,1,0,0);
output;
end;
in_unit = unit;
in_room = room;
end;
format at_dt datetime16.;
run;
* prepare for transposition;
proc sort data=hours;
by at_dt in_unit in_room id;
run;
* transpose to know which time/unit/room has multiple patients;
proc transpose data=hours out=roomies_by_hour(drop=_name_ where=(not missing(patid2))) prefix=patid;
by at_dt in_unit in_room ;
var id;
run;
* 'unfill' the individual hours to get ranges again;
data roomies;
set roomies_by_hour;
by in_unit in_room patid1 patid2;
retain start_dt end_dt;
format start_dt end_dt datetime16.;
if first.patid2 then
start_dt = at_dt;
if last.patid2 then do;
end_dt = at_dt;
length_hrs = intck('hours', start_dt, end_dt);
output;
end;
run;
* stack data flipping perspective of who shared with who;
data roomies_mirrored;
set
roomies /* patid1 centric */
roomies(rename=(patid1=patid2 patid2=patid1)) /* patid2 centric */
;
run;
proc sort data=roomies_mirrored;
by patid1 start_dt;
run;
Upvotes: 0
Reputation: 3845
We need to process the data by location
proc sort HAVE;
by assigned_pat_loc data time;
run;
In the result, we don not need temporary variables (starting with underscore) and the date and time must be renamed to end_date and end_time.
data WANT (drop= _: rename=(date=end_date time=end_time));
set HAVE;
by assigned_pat_loc data time;
I generalize the problem to rooms with a capacity above 2 and use arrays. Extending the temporary arrays beyond &max_patients, saves me a few if-statements. Note that temporary arrays are dropped in the result and are retained anyway.
%let max_patients = 9;
array id_r {&max_patients - 1} id_1 - id_%eval(&max_patients - 1);
array patients temporary {&max_patients + 1};
array admissions temporary {&max_patients + 1};
if _N_ eq 1 then patient_count = 0;
retain patient_count;
for every pat_loc, start all over
if first.assigned_pat_loc then do;
do patient_nr = 1 to patient_count;
patients[patient_nr] = .;
end;
patient_count = 0;
end;
if a patient leaves, calculate the time she spent
if Activity in (“Discharge”, “Death”) then do;
_found_patient = 0;
do _patient_nr = 1 to patient_count;
if patients[_patient_nr] eq id then do;
start_date = datepart(admissions[_patient_nr]);
start_time = timepart(admissions[_patient_nr]);
duration = (dhms(date,0,0,time) - admissions[_patient_nr]) / 3600;
_found_patient = 1;
end;
end;
shift the patients that arrived later
if _found_patient then do;
patients[_patient_nr] = patients[_patient_nr + 1];
admissions[_patient_nr] = admissions[_patient_nr + 1];
end;
patient_count = patient_count - 1;
find out who else was in the pat_loc and write the result
do _patient_nr = 1 to patient_count;
id_r[_patient_nr] = patents[_patient_nr];
end;
output;
end;
if a patient arrives, register that for later
else do;
patient_count = patient_count + 1;
patients[_patient_nr] = id;
admissions[_patient_nr] = dhms(date,0,0,time);
end;
run;
sort the results
proc sort;
by id start_date start_time;
run;
Disclaimer: this is a draft, which might need debugging.
Upvotes: 0