Reputation: 471
I have a table with daily reports for given units that can be simplified to:
Unit | Status | Day 1 | On | 10 March 2016 2 | On | 10 March 2016 3 | On | 10 March 2016 4 | On | 10 March 2016 1 | Off | 11 March 2016 2 | On | 11 March 2016 3 | On | 11 March 2016 4 | On | 11 March 2016 . . .
I'm looking to query which units have reported 'Off' for the last 7 days but not really sure how to do it. I would like the query to be something like
SELECT
Unit,
Status,
CASE WHEN([all 7 previous days were 'Off']) THEN 'Dead' ELSE 'Alive' END
FROM unit_table a
WHERE Day = '15 Sept 2016'
How could I do that IF clause? Something like
CASE WHEN(SELECT COUNT(SELECT * FROM unit_table b WHERE DAY > '08 Sept 2016' AND Status = 'Off' AND b.Unit = a.Unit) > 7)
maybe?
I'm aware that 'DAY > '08 Sept 2016'' won't work natively, there's some internal functions used for comparing and updating dates. This is there just for explanatory purposes.
Upvotes: 3
Views: 78
Reputation: 25903
The WINDOW Frame version version of COUNT_IF ( COUNT with a nested IFF can also be used), solves this problem:
With this great CTE of data to use:
WITH table_data(unit, status, day) AS (
SELECT column1, column2, column3::date FROM VALUES
(1, 'On', '2016-03-10'),
(1, 'On', '2016-03-11'),
(1, 'On', '2016-03-12'),
(1, 'On', '2016-03-13'),
(1, 'On', '2016-03-14'),
(1, 'On', '2016-03-15'),
(1, 'On', '2016-03-16'),
(1, 'On', '2016-03-17'),
(1, 'On', '2016-03-18'),
(2, 'Off', '2016-03-10'),
(2, 'Off', '2016-03-11'),
(2, 'Off', '2016-03-12'),
(2, 'Off', '2016-03-13'),
(2, 'Off', '2016-03-14'),
(2, 'Off', '2016-03-15'),
(2, 'Off', '2016-03-16'),
(2, 'Off', '2016-03-17'),
(2, 'Off', '2016-03-18'),
(3, 'Off', '2016-03-10'),
(3, 'Off', '2016-03-11'),
(3, 'Off', '2016-03-12'),
(3, 'Off', '2016-03-13'),
(3, 'Off', '2016-03-14'),
(3, 'On', '2016-03-15'),
(3, 'Off', '2016-03-16'),
(3, 'Off', '2016-03-17'),
(3, 'Off', '2016-03-18')
)
We can use:
SELECT *
,COUNT_IF(status = 'Off') over (PARTITION BY unit ORDER BY day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as c_count_if_off
,IFF(c_count_if_off = 7, 'Dead', 'Alive') AS seven_day_status
FROM table_data
ORDER BY 1,3;
and get:
UNIT | STATUS | DAY | C_COUNT_IF_OFF | SEVEN_DAY_STATUS |
---|---|---|---|---|
1 | On | 2016-03-10 | 0 | Alive |
1 | On | 2016-03-11 | 0 | Alive |
1 | On | 2016-03-12 | 0 | Alive |
1 | On | 2016-03-13 | 0 | Alive |
1 | On | 2016-03-14 | 0 | Alive |
1 | On | 2016-03-15 | 0 | Alive |
1 | On | 2016-03-16 | 0 | Alive |
1 | On | 2016-03-17 | 0 | Alive |
1 | On | 2016-03-18 | 0 | Alive |
2 | Off | 2016-03-10 | 1 | Alive |
2 | Off | 2016-03-11 | 2 | Alive |
2 | Off | 2016-03-12 | 3 | Alive |
2 | Off | 2016-03-13 | 4 | Alive |
2 | Off | 2016-03-14 | 5 | Alive |
2 | Off | 2016-03-15 | 6 | Alive |
2 | Off | 2016-03-16 | 7 | Dead |
2 | Off | 2016-03-17 | 7 | Dead |
2 | Off | 2016-03-18 | 7 | Dead |
3 | Off | 2016-03-10 | 1 | Alive |
3 | Off | 2016-03-11 | 2 | Alive |
3 | Off | 2016-03-12 | 3 | Alive |
3 | Off | 2016-03-13 | 4 | Alive |
3 | Off | 2016-03-14 | 5 | Alive |
3 | On | 2016-03-15 | 5 | Alive |
3 | Off | 2016-03-16 | 6 | Alive |
3 | Off | 2016-03-17 | 6 | Alive |
3 | Off | 2016-03-18 | 6 | Alive |
The COUNT_IF COUNT_IF(status = 'Off')
is the same as COUNT(IFF(status = 'Off',1,null))
or SUM(IFF(status = 'Off',1,0))
And the whole thing can be mashed together into a single line:
SELECT *
,IFF(COUNT_IF(status = 'Off') over (PARTITION BY unit ORDER BY day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) = 7, 'Dead', 'Alive') AS seven_day_status
FROM table_data
ORDER BY 1,3;
I would use MATCH_RECONGNIZE of some other manual form of edge detection.
Upvotes: 0
Reputation: 4719
Gordon, thanks for the answer, your second query should work. Small correction - date >=
should probably be day >=
.
The queries with DISTINCT
can often be slower (due to duplicate elimination), so if it is guaranteed that for one unit/day combination there is only 1 record, removing it will work.
If it's possible to have more than 1 record per unit per day, Gordon's query will report the unit as dead if it reported Off
at least once per day (even if there were some On
reports). If one wants logic where the unit only reported Off
in the last 7 days, here's an alternative
select u.unit
from unit_table u
where day >= dateadd(day, -6, CURRENT_DATE)
group by u.unit
having count(nullif(u.status, 'Off')) = 0;
Upvotes: 1
Reputation: 1269483
Hmmm, I'm thinking something like this:
select u.unit
from unit_table u
where date >= '2016-09-15' - interval '6 day'
group by u.unit
having count(distinct case when status = 'Off' then day end) = 7;
I'm not intimately familiar with Snowflake, but the syntax might look like:
select u.unit
from unit_table u
where date >= dateadd(day, -6, CURRENT_DATE)
group by u.unit
having count(distinct case when u.status = 'Off' then day end) = 7;
Upvotes: 1