Cur123
Cur123

Reputation: 89

How do I group by a date range?

I have 3 fields: id, date, treatment. There are 3 types of treatment: Cold, fever, cholera. Assume there are 1000 patients and the first patient's data looks like this

pt    treatment_date   treatment
A     05-05-2017          Cold
A     05-07-2017          Cold
A     05-09-2017          Fever
A     05-13-2017          Fever
A     05-15-2017          Cholera
A     05-17-2017          Cholera
A     05-19-2017          Cold
A     05-21-2017          Cold
A     05-23-2017          Fever

I need my output to look like this-

pt    start_date   end_date    treatment   Number_of_days  Conversion_date    Days_before_cholera(start date of cholera- end date of treatment immediately before it)
 A    05-05-2017   05-07-2017   Cold           2               0               0       
 A    05-09-2017   05-13-2017   Fever          4               0               0
 A    05-15-2017   05-17-2017   Cholera        2              05-13-2017       2
 A    05-19-2017   05-21-2017   Cold           2                0              0
 A    05-23-2017   05-23-2017   Fever          1                0              0

So goes on for all patient_ids.

Upvotes: 0

Views: 268

Answers (2)

Joshua Schlichting
Joshua Schlichting

Reputation: 3450

You're going to need to join the table to itself for this one. I'd try something along these lines.

SELECT
    a.pt
    ,a.treatment
    ,a.treatment_date AS start_date
    ,CASE    /*this is for your last fever row with the same date*/
        WHEN b.treatment_date IS NULL
        THEN a.treatment_date
        ELSE b.treatment_date
        END AS end_date
    /*other fields here*/

FROM
    MyTable a
    LEFT JOIN MyTable b
    ON a.pt = b.pt
    AND a.treatment = b.treatment
WHERE
    a.treatment_date < b.treatment_date
    /*make sure there isn't any date in between, 
      this should stop you from joining rows you didn't intend on joining on*/
    AND NOT EXISTS (
                        SELECT
                            x.treatment_date
                        FROM
                            MyTable x
                        WHERE
                            a.pt = x.pt
                            AND a.treatment = x.treatment
                            AND x.treatment_date < b.treatment_date
                            AND x.treatment_date > a.treatment_date
                    )

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269443

This is a "gaps-and-islands" problem. I show you have to handle the calculation of the rows. You can fill in the additional columns.

One way to solve it is using the difference of row numbers:

select pt, min(treatment_date), max(treatment_date), . . .
from (select t.*,
             row_number() over (partition by pt order by treatment_date) as seqnum_p,
             row_number() over (partition by pt, treatment order by treatment_date) as seqnum_ptt
      from t
     ) t
group by pt, (seqnum_p - seqnum_ptt);

Upvotes: 1

Related Questions