user3642531

Reputation: 319

Giving a common value to groups of consecutive hours in SQL

I am using Netezza.

Let's say I have a table with two fields: one field is a timestamp corresponding to every hour in the day, the other is an indicator for whether or not a patient took an antacid during the hour. The table looks as follows:

Timestamp           Antacid?
11/23/2016 08:00          1
11/23/2016 09:00          1
11/23/2016 10:00          1
11/23/2016 11:00          0
11/23/2016 12:00          0
11/23/2016 13:00          1
11/23/2016 14:00          1
11/23/2016 15:00          0

Is there a way to assign a common partition value to each set of consecutive hour intervals? Something like this...

Timestamp           Antacid?      Group
11/23/2016 08:00          1           1
11/23/2016 09:00          1           1
11/23/2016 10:00          1           1
11/23/2016 11:00          0        NULL
11/23/2016 12:00          0        NULL
11/23/2016 13:00          1           2
11/23/2016 14:00          1           2
11/23/2016 15:00          0        NULL

Ultimately, I would like to find the start and end timestamps of every run of consecutive hours of antacid usage. For the first group these would be 11/23/2016 08:00 and 11/23/2016 10:00, and for the second group 11/23/2016 13:00 and 11/23/2016 14:00. I have done this before for consecutive days using extract(epoch from date - row_number()), but I'm not sure how to handle hours.

Upvotes: 0

Views: 333

Answers (1)

Vamsi Prabhala

Reputation: 49260

I assume this has to be done per patient (the id column in the queries here). You can use the difference of two row numbers:

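select id, antacid, min(dt) as startdate, max(dt) as enddate
from (
    select t.*,
           -- row number over all of a patient's rows minus the row number within
           -- the same antacid value; the difference is constant across a run
           row_number() over (partition by id order by dt)
         - row_number() over (partition by id, antacid order by dt) as grp
    from t
) x
where antacid = 1
group by id, antacid, grp
order by 1, 3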

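The inner query labels each contiguous run of 0s or 1s in antacid for a given patient id: within a run both row numbers increase by one per row, so their difference (grp) stays constant. Because you only need the start and end dates for antacid = 1, the where clause filters out the runs of 0s.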
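To see why the difference of the two row numbers is constant within a run, here is a minimal sketch in Python (used instead of SQL only so it runs without a database). It replays the question's sample rows for a single patient; the names rows, label_islands and islands are made up for this illustration and are not part of the query.

```python
from itertools import groupby

# Sample rows from the question: (hour, antacid flag) for one patient, ordered by time.
rows = [
    ("08:00", 1), ("09:00", 1), ("10:00", 1),
    ("11:00", 0), ("12:00", 0),
    ("13:00", 1), ("14:00", 1),
    ("15:00", 0),
]

def label_islands(rows):
    """Attach grp = overall row number minus row number within the same flag value."""
    seen = {}  # running row number per antacid value
    labelled = []
    for overall, (hour, flag) in enumerate(rows, start=1):
        seen[flag] = seen.get(flag, 0) + 1
        labelled.append((hour, flag, overall - seen[flag]))
    return labelled

def islands(rows, flag=1):
    """Return (start, end) for each run of rows carrying the given flag."""
    kept = [r for r in label_islands(rows) if r[1] == flag]
    runs = (list(g) for _, g in groupby(kept, key=lambda r: r[2]))
    return [(run[0][0], run[-1][0]) for run in runs]

print(label_islands(rows))
print(islands(rows))
```

Note that this approach only looks at row order, so it treats neighbouring rows as consecutive even if an hour is missing from the table.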

If this has to be done separately for each day, add the date portion of dt to the partition by clauses.

Edit: To group rows only when the current row and the next row are exactly one hour apart, subtract row_number() hours from dt; the result is the same for every row in a run of consecutive hours.

select id, antacid, min(dt) as startdate, max(dt) as enddate
from (
    select t.*,
           -- adapt dateadd to the equivalent Netezza date arithmetic so that
           -- row_number() hours are subtracted from dt
           dateadd(hour, -row_number() over (partition by id, antacid order by dt), dt) as grp
    from t
) x
where antacid = 1
group by id, antacid, grp
order by 1, 3
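The idea behind the edited query can be checked with a small Python sketch (again Python only so it runs without a database). It assumes a hypothetical variant of the data in which only the antacid = 1 rows exist, so 11:00 and 12:00 are simply absent; hourly_islands and taken are illustrative names.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def hourly_islands(timestamps):
    """Group timestamps that are exactly one hour apart.

    Mirrors dt minus row_number() hours: every timestamp in a run of
    consecutive hours maps to the same anchor value.
    """
    groups = defaultdict(list)
    for n, ts in enumerate(sorted(timestamps), start=1):
        anchor = ts - timedelta(hours=n)
        groups[anchor].append(ts)
    # dicts keep insertion order, so the runs come out in chronological order
    return [(run[0], run[-1]) for run in groups.values()]

# Hypothetical input: hours in which the antacid was taken; 11:00 and 12:00 are missing.
taken = [datetime(2016, 11, 23, h) for h in (8, 9, 10, 13, 14)]

for start, end in hourly_islands(taken):
    print(start, end)
```

Because the anchor depends on the timestamp itself, a gap in the hours starts a new group even when no row marks the gap.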

Upvotes: 1
