shakeel ahmad
shakeel ahmad

Reputation: 73

How to transform Table data to another Table format in SAS

I am stuck in transforming the data table from one format to another format using the SAS Programming function. The structure of the Table is given as below:

id  Date        Time    assigned_pat_loc    prior_pat_loc       Activity
1   May/31/11   8:00    EIAB^EIAB^6                             Admission
1   May/31/11   9:00    8w^201               EIAB^EIAB^6      Transfer to 8w
1   Jun/8/11    15:00   8w^201                                  Discharge
2   May/31/11   5:00    EIAB^EIAB^4                             Admission 
2   May/31/11   7:00    10E^45               EIAB^EIAB^4    Transfer to 10E
2   Jun/1/11    1:00    8w^201                  10E^45      Transfer to 8w
2   Jun/1/11    8:00    8w^201                                 Discharge
3   May/31/11   9:00    EIAB^EIAB^2                            Admission
3   Jun/1/11    9:00    8w^201               EIAB^EIAB^2    Transfer to 8w
3   Jun/5/11    9:00    8w^201                                 Discharge
4   May/31/11   9:00    EIAB^EIAB^9                           Admission
4   May/31/11   7:00    10E^45               EIAB^EIAB^9    Transfer to 10E
4   Jun/1/11    8:00    10E^45                                     Death

1

I want to transform it into the following table. enter image description here

Here are the details of the variables.

Please anyone guide me how it could be done?

Upvotes: 0

Views: 457

Answers (2)

Richard
Richard

Reputation: 27508

When dealing with ranges in which there is a possibility of an unexpected overlap case you can enumerate over the range and perform simpler logic for finding shared time/unit/room.

Example:

data have;

length id date time 8 loc ploc $20 activity $10;
input 
id  Date& date11.   Time time5.   loc    ploc       Activity;
format date date9. time time5.;

datetime = dhms (date,0,0,0) + time;

length unit room bed punit proom pbed $4;

unit = scan(loc,1,'^');
room = scan(loc,2,'^');
bed  = scan(loc,3,'^');

punit = scan(ploc,1,'^');
proom = scan(ploc,2,'^');
pbed  = scan(ploc,3,'^');

drop loc ploc;

datalines;
1   31-May-2011   8:00    EIAB^EIAB^6          .                Admission
1   31-May-2011   9:00    8w^201               EIAB^EIAB^6      Transfer to 8w
1    8-Jun-2011  15:00    8w^201               .                Discharge
2   31-May-2011   5:00    EIAB^EIAB^4          .                Admission 
2   31-May-2011   7:00    10E^45               EIAB^EIAB^4      Transfer to 10E
2    1-Jun-2011   1:00    8w^201               10E^45           Transfer to 8w
2    1-Jun-2011   8:00    8w^201               .                Discharge
3   31-May-2011   9:00    EIAB^EIAB^2          .                Admission
3    1-Jun-2011   9:00    8w^201               EIAB^EIAB^2      Transfer to 8w
3    5-Jun-2011   9:00    8w^201               .                Discharge
4   31-May-2011   9:00    EIAB^EIAB^9          .                Admission
4   31-May-2011   7:00    10E^45               EIAB^EIAB^9      Transfer to 10E
4    1-Jun-2011   8:00    10E^45               .                Death
;


* Fill in the ranges to get data by hour;

data hours(keep=id in_unit in_room at_dt);
  set have;
  by id;

  retain at_dt in_unit in_room;

  if first.id then do;
    at_dt = datetime;
    in_unit = unit;
    in_room = room;
  end;
  else do;
    do at_dt = at_dt to datetime-1 by dhms(0,1,0,0);
      output;
    end;
    in_unit = unit;
    in_room = room;
  end;

  format at_dt datetime16.;
run;

* prepare for transposition;

proc sort data=hours;
  by at_dt in_unit in_room id;
run;

* transpose to know which time/unit/room has multiple patients;

proc transpose data=hours out=roomies_by_hour(drop=_name_ where=(not missing(patid2))) prefix=patid;
  by at_dt in_unit in_room ;
  var id;
run;

* 'unfill' the individual hours to get ranges again;

data roomies;
  set roomies_by_hour;
  by in_unit in_room patid1 patid2;

  retain start_dt end_dt;
  format start_dt end_dt datetime16.;

  if first.patid2 then
    start_dt = at_dt;

  if last.patid2 then do;
    end_dt = at_dt;
    length_hrs = intck('hours', start_dt, end_dt);
    output;
  end;
run;

* stack data flipping perspective of who shared with who;

data roomies_mirrored;
  set 
    roomies                                         /* patid1 centric */
    roomies(rename=(patid1=patid2 patid2=patid1))   /* patid2 centric */
  ;
run;

proc sort data=roomies_mirrored;
  by patid1 start_dt;
run;

Upvotes: 0

Dirk Horsten
Dirk Horsten

Reputation: 3845

We need to process the data by location

proc sort HAVE;
  by assigned_pat_loc data time;
run;

In the result, we don not need temporary variables (starting with underscore) and the date and time must be renamed to end_date and end_time.

data WANT (drop= _: rename=(date=end_date time=end_time));
  set HAVE;
  by assigned_pat_loc data time;

I generalize the problem to rooms with a capacity above 2 and use arrays. Extending the temporary arrays beyond &max_patients, saves me a few if-statements. Note that temporary arrays are dropped in the result and are retained anyway.

  %let max_patients = 9;
  array id_r {&max_patients - 1} id_1 - id_%eval(&max_patients - 1); 
  array patients temporary {&max_patients + 1}; 
  array admissions temporary {&max_patients + 1}; 
  if _N_ eq 1 then patient_count = 0;
  retain patient_count; 

for every pat_loc, start all over

  if first.assigned_pat_loc then do;
    do patient_nr = 1 to patient_count;
      patients[patient_nr] = .;
    end;
    patient_count = 0;
  end;

if a patient leaves, calculate the time she spent

  if Activity in (“Discharge”, “Death”) then do;
    _found_patient = 0;
    do _patient_nr = 1 to patient_count;
      if patients[_patient_nr] eq id then do;
        start_date = datepart(admissions[_patient_nr]);
        start_time = timepart(admissions[_patient_nr]);
        duration = (dhms(date,0,0,time) - admissions[_patient_nr]) / 3600;
        _found_patient = 1;
      end;
    end;

shift the patients that arrived later

    if _found_patient then do;
      patients[_patient_nr] = patients[_patient_nr + 1];
      admissions[_patient_nr] = admissions[_patient_nr + 1];
    end;
    patient_count = patient_count - 1;

find out who else was in the pat_loc and write the result

    do _patient_nr = 1 to patient_count;
      id_r[_patient_nr] = patents[_patient_nr];
    end;
    output;
  end;

if a patient arrives, register that for later

  else do;
    patient_count = patient_count + 1;
    patients[_patient_nr] = id;
    admissions[_patient_nr] = dhms(date,0,0,time); 
  end;
run;

sort the results

proc sort;
  by id start_date start_time;
run;

Disclaimer: this is a draft, which might need debugging.

Upvotes: 0

Related Questions