Reputation: 13
I have to count the number of status changes, but only if the time difference between one status and the next is less than 30 minutes. In my database I have a current time column and a previous time column that I made using OVER (PARTITION BY ...). Here is my query, but I am getting the error: "Selected non-aggregated values must be part of the associated group". Could anyone help?
select col1, col2,
MAX(creation_dt_utc) OVER(PARTITION BY col1,col2,col3 ORDER BY creation_dt ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS previous_creation_dt,
(creation_dt - prev_creation_dt) DAY(4) TO SECOND(6) as time_difference,
EXTRACT(DAY FROM time_difference) * 24*60 + EXTRACT(HOUR FROM time_difference) * 60 + EXTRACT(MINUTE FROM time_difference) AS Total_Minutes,
SUM(
CASE WHEN status_previous='Test1'
and status_current='Test2' THEN 1
ELSE
CASE WHEN status_previous='Test3'
and status_current='Test2' THEN 1
ELSE
CASE WHEN status_previous='Test4'
and status_current='Test2' THEN 1
ELSE 0
END
END
END
) AS "Total_Change"
from myTable
qualify Total_Minutes<30
where EXTRACT(YEAR from year_column)='2017';
Upvotes: 1
Views: 542
Reputation: 29667
I think that QUALIFY should be after the WHERE clause.
And for a previous value, I assume LAG would be more fitting than a MAX.
And those nested CASE expressions can be written as one CASE, because once a WHEN condition is met, it won't check the other WHEN conditions after it.
Since a normal SUM is used, there should be a GROUP BY.
SELECT col1, col2,
COUNT(*) AS Total,
SUM(TimeDiffMinutes) AS Total_Minutes,
SUM(CASE WHEN StatusChanged = 1 THEN TimeDiffMinutes ELSE 0 END) AS Total_Minutes_Change,
COUNT(CASE WHEN StatusChanged = 1 THEN 1 END) AS Total_Change
FROM
(
SELECT col1, col2, col3, creation_dt,
(CASE
WHEN status_previous='Test1' and status_current='Test2' THEN 1
WHEN status_previous='Test3' and status_current='Test2' THEN 1
WHEN status_previous='Test4' and status_current='Test2' THEN 1
ELSE 0
END) AS StatusChanged,
LAG(creation_dt) OVER (PARTITION BY col1, col2, col3 ORDER BY creation_dt) AS prev_creation_dt,
(creation_dt - prev_creation_dt) DAY(4) TO SECOND(6) AS time_difference,
EXTRACT(DAY FROM time_difference)*(24*60) + EXTRACT(HOUR FROM time_difference)*60 + EXTRACT(MINUTE FROM time_difference) AS TimeDiffMinutes
FROM myTable
WHERE EXTRACT(YEAR from year_column) = '2017'
QUALIFY (creation_dt - prev_creation_dt) day(4) to second(6) < interval '30' minute
) q
GROUP BY col1, col2
ORDER BY col1, col2
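As a quick check of the minutes arithmetic used in the inner query, the sketch below feeds it a made-up interval literal (1 day, 2 hours, 15 minutes, 30 seconds). It assumes Teradata, where a column alias can be reused within the same SELECT, as the query above already does. Note that the seconds part is not counted.

```sql
-- Hypothetical literal, chosen only to illustrate the arithmetic.
SELECT INTERVAL '1 02:15:30' DAY TO SECOND AS time_difference,
       EXTRACT(DAY FROM time_difference)*(24*60)
     + EXTRACT(HOUR FROM time_difference)*60
     + EXTRACT(MINUTE FROM time_difference) AS TimeDiffMinutes;
-- arithmetic: 1*1440 + 2*60 + 15 = 1575 minutes (the 30 seconds are ignored)
```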
Upvotes: 0
Reputation: 60472
Analytic functions are processed after aggregation (where-from-group by-having-olap-qualify-order by), thus you can't apply an aggregate on the result of an OVER; you must nest it in a Derived Table or a Common Table Expression:
SELECT
Sum(
CASE WHEN (status_previous='Test1' AND status_current='Test2')
OR (status_previous='Test3' AND status_current='Test2')
OR (status_previous='Test4' AND status_current='Test2')
THEN 1
ELSE 0
END) AS "Total_Change"
FROM
(
SELECT col1, col2, status_previous, status_current, -- the outer CASE needs both status columns
Max(creation_dt_utc)
Over(PARTITION BY col1,col2,col3
ORDER BY creation_dt
ROWS BETWEEN 1 Preceding AND 1 Preceding) AS previous_creation_dt,
(creation_dt - previous_creation_dt) DAY(4) TO SECOND(6) AS time_difference,
Extract(DAY From time_difference) * 24*60 + Extract(HOUR From time_difference) * 60 + Extract(MINUTE From time_difference) AS Total_Minutes
FROM myTable
WHERE Extract(YEAR From year_column)=2017 -- the result of EXTRACT is an INTEGER, not a string
QUALIFY Total_Minutes<30
) AS dt
But as you only want a count you could move the CASE into QUALIFY:
SELECT Count(*) AS "Total_Change"
FROM
(
SELECT col1, col2,
Max(creation_dt_utc)
Over(PARTITION BY col1,col2,col3
ORDER BY creation_dt
ROWS BETWEEN 1 Preceding AND 1 Preceding) AS previous_creation_dt,
(creation_dt - previous_creation_dt) DAY(4) TO SECOND(6) AS time_difference,
Extract(DAY From time_difference) * 24*60 + Extract(HOUR From time_difference) * 60 + Extract(MINUTE From time_difference) AS Total_Minutes
FROM myTable
WHERE Extract(YEAR From year_column)=2017 -- the result of EXTRACT is an INTEGER, not a string
QUALIFY Total_Minutes<30
AND ( (status_previous='Test1' AND status_current='Test2')
OR (status_previous='Test3' AND status_current='Test2')
OR (status_previous='Test4' AND status_current='Test2')
)
) AS dt
Edit:
The CASE logic can be further simplified to:
CASE WHEN status_current='Test2' and status_previous IN ('Test1','Test3','Test4')
THEN 1
ELSE 0
END
or maybe
CASE WHEN status_current='Test2' and status_previous <>'Test2'
THEN 1
ELSE 0
END
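The answer mentions a Common Table Expression as an alternative to the Derived Table, but only the Derived Table form is shown. A minimal sketch of the CTE form, combined with the simplified IN condition, might look like this. It assumes that status_previous and status_current are columns of myTable and that the previous timestamp should be taken from creation_dt; both are assumptions, as the question does not show the table definition.

```sql
-- Sketch only: table and column names come from the question;
-- status_previous/status_current are assumed to be columns of myTable.
WITH cte AS
 (
   SELECT col1, col2, status_previous, status_current,
      MAX(creation_dt)
      OVER (PARTITION BY col1, col2, col3
            ORDER BY creation_dt
            ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS previous_creation_dt,
      (creation_dt - previous_creation_dt) DAY(4) TO SECOND(6) AS time_difference,
      EXTRACT(DAY FROM time_difference) * 24*60
    + EXTRACT(HOUR FROM time_difference) * 60
    + EXTRACT(MINUTE FROM time_difference) AS Total_Minutes
   FROM myTable
   WHERE EXTRACT(YEAR FROM year_column) = 2017
   -- keep only rows less than 30 minutes after the previous row;
   -- the first row of each partition has no previous row and is dropped
   QUALIFY Total_Minutes < 30
 )
SELECT
   SUM(CASE WHEN status_current = 'Test2'
             AND status_previous IN ('Test1','Test3','Test4')
            THEN 1
            ELSE 0
       END) AS "Total_Change"
FROM cte;
```

The window function and QUALIFY are evaluated inside the CTE, so the outer SELECT only aggregates plain columns, which is what avoids the "non-aggregated values" error.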
Upvotes: 1