craig
craig

Reputation: 26262

Group items in a way that sets of repeating values are segmented

I have a table of arrivals and departure for a patient's stay:

ENC_ID    ARRIVE                  DEPART                  UNIT      LEVEL
123456789 2018-07-16 17:53:00.000 2018-07-17 06:30:00.000 ICU       TRAUMA
123456789 2018-07-17 06:30:00.000 2018-07-17 09:05:00.000 PERI OP   TRAUMA
123456789 2018-07-17 09:05:00.000 2018-07-18 09:06:00.000 ICU       TRAUMA
123456789 2018-07-18 23:14:00.000 2018-07-23 07:33:00.000 UNIT 5    NULL
123456789 2018-07-23 07:33:00.000 2018-07-23 14:57:00.000 ICU       TRAUMA
123456789 2018-07-23 14:57:00.000 2018-07-30 11:06:00.000 INTRA OP  TRAUMA
123456789 2018-07-30 11:06:00.000 2018-07-31 11:06:00.000 UNIT 5    NULL

I need to consolidate the records based on the groups of LEVEL:

ENC_ID    MIN(ARRIVE)         MAX(DEPART)         LEVEL
123456789 2018-07-16 17:53:00 2018-07-18 09:06:00 TRAUMA
123456789 2018-07-18 23:14:00 2018-07-23 07:33:00 NULL
123456789 2018-07-23 07:33:00 2018-07-30 11:06:00 TRAUMA
123456789 2018-07-30 11:06:00 2018-07-31 11:06:00 NULL

I hoping using DENSE_RANK to create a SEQ number for each LEVEL set that I could later use to GROUP BY:

ENC_ID    ARRIVED                 DEPARTED                UNIT     LEVEL   SEQ
159939879 2018-07-16 17:53:00.000 2018-07-17 06:30:00.000 ICU      TRAUMA  1
159939879 2018-07-17 06:30:00.000 2018-07-17 09:05:00.000 PERI OP  TRAUMA  1
159939879 2018-07-17 09:05:00.000 2018-07-18 09:06:00.000 ICU      TRAUMA  1
159939879 2018-07-18 23:14:00.000 2018-07-23 07:33:00.000 UNIT 5   NULL    2
159939879 2018-07-23 07:33:00.000 2018-07-23 14:57:00.000 ICU      TRAUMA  3
159939879 2018-07-23 14:57:00.000 2018-07-30 11:06:00.000 INTRA OP TRAUMA  3
159939879 2018-07-30 11:06:00.000 2018-07-31 11:06:00.000 UNIT 5    NULL   4

but DENSE_RANK() over (partition by ENC_ID order by LEVEL) it doesn't differentiate the LEVEL sets in a way that I can use:

ENC_ID    ARRIVED                 DEPARTED                UNIT     LEVEL   DR
159939879 2018-07-16 17:53:00.000 2018-07-17 06:30:00.000 ICU      TRAUMA  2
159939879 2018-07-17 06:30:00.000 2018-07-17 09:05:00.000 PERI OP  TRAUMA  2
159939879 2018-07-17 09:05:00.000 2018-07-18 09:06:00.000 ICU      TRAUMA  2
159939879 2018-07-18 23:14:00.000 2018-07-23 07:33:00.000 UNIT 5   NULL    1
159939879 2018-07-23 07:33:00.000 2018-07-23 14:57:00.000 ICU      TRAUMA  2
159939879 2018-07-23 14:57:00.000 2018-07-30 11:06:00.000 INTRA OP TRAUMA  2
159939879 2018-07-30 11:06:00.000 2018-07-31 11:06:00.000 UNIT 5    NULL   1

Is there a way to achieve this?

Upvotes: 0

Views: 28

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270091

This is a gaps-and-islands problem. You want to know how many changes. You can use lag() and a cumulative sum:

select t.*,
       sum(case when prev_level = level then 0 else 1 end) over (partition by enc_id order by arrived)
from (select t.*,
             lag(level) over (partition by ENC_ID order by ARRIVED) as prev_level
      from t
     ) t;

You can then aggregate by this value.

Note the above may not handle adjacent NULL values as you want. So, a difference of row numbers might be better:

select enc_id, min(arrive), max(depart), level,
       row_number() over (order by min(arrived))
from (select t.*,
             row_number() over (partition by ENC_ID, level order by ARRIVED) as seqnum_l,
             row_number() over (partition by ENC_ID order by ARRIVED) as seqnum
      from t
     ) t
group by enc_id, (seqnum - seqnum_l), level;

Upvotes: 1

Related Questions