user3642531
user3642531

Reputation: 319

Collapsing date ranges in SQL (Netezza)

I have a data set that looks something like this:

Visit ID   Admission Date       Discharge Date       Unit
20         01/01/2015 12:45     01/01/2015 13:57     ER
20         01/03/2015 13:57     01/04/2015 11:57     ER
20         01/04/2015 11:57     01/04/2015 19:32     Trauma
20         01/04/2015 19:32     01/04/2015 21:22     ER

My goal is to get admission/discharge dates for each unit. The issue is that sometimes patients change beds within in the same unit, and this action is labeled as a transfer even though the patient is still within the same unit. So I would like to collapse those date ranges so the output instead looks like this:

Visit ID   Admission Date       Discharge Date       Unit
20         01/01/2015 12:45     01/01/2015 11:57     ER
20         01/04/2015 11:57     01/04/2015 19:32     Trauma
20         01/04/2015 19:32     01/04/2015 21:22     ER

I don't know how to achieve this...I was thinking of what partition I should use but every ranking partition I can think of (rank/dense_rank) will assign the first two ER values the same rank as the last ER value, which would be incorrect.

Basically, my question is the same this is unanswered question: Collapsing date records only if the value doesn't change - Oracle SQL

I'm using Netezza.

Upvotes: 1

Views: 389

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269693

You can use a left join to see if something is connected to the previous record. If there is no connection, then you have the beginning of a "continuous period". Then, a cumulative sum assigns a grouping, which can be used for aggregation.

That is how this query works:

select visitid, unit,
       min(admissiondate) as admissiondate,
       max(dischargedate) as dischargedate
from (select t.*,
             sum(case when tprev.visitid is null then 1 else 0 end) over
                 (partition by t.visitid, t.unit order by t.admissiondate 
                 ) as grp
      from t left join
           t tprev
           on t.visitid = tprev.visitid and t.unit = tprev.unit and
              t.admissiondate = tprev.dischargedate
     ) t
group by grp, visitid, unit;

Note: This assumes that the new admission date is exactly the same as the previous discharge date. Of course, you can introduce non-equality logic if you want to check that the admission occurred within, say 10 seconds or 5 minutes of the discharge.

Upvotes: 1

Related Questions