Reputation: 25
I'm monitoring the effectiveness of a temperature controller in an IoT application. I'm trying to find "interesting points" in the time series. These are similar to local minima or maxima but include points on the curve when a trend is beginning. It's not just min and max. It's where a range of 5 points occurred within an hour.
Source
| inMins | unixTime | temp |
|--------|------------------|------|
| 0 | 1479042000000000 | 10.0 |
| 5 | 1479042300000000 | 11.0 |
| 10 | 1479042600000000 | 12.0 |
| 15 | 1479042600000000 | 13.0 |
| 20 | 1479043200000000 | 14.0 |
| 25 | 1479043500000000 | 15.0 |
| 30 | 1479043800000000 | 14.0 |
| 35 | 1479044100000000 | 13.0 |
| 40 | 1479044400000000 | 12.0 |
| 45 | 1479044700000000 | 11.0 |
| 50 | 1479045000000000 | 10.0 |
| 55 | 1479045300000000 | 9.0 |
| 60 | 1479045600000000 | 8.0 |
| 65 | 1479045900000000 | 9.0 |
| 70 | 1479046200000000 | 10.0 |
| 75 | 1479046500000000 | 11.0 |
| 80 | 1479046800000000 | 12.0 |
| 85 | 1479047100000000 | 13.0 |
| 90 | 1479047400000000 | 14.0 |
Desired Shape
| inMins | unixTime | temp | coldOrHot |
|--------|------------------|------|-----------|
| 0 | 1479042000000000 | 10.0 | 1 |
| 25 | 1479043500000000 | 15.0 | 2 |
| 30 | 1479043800000000 | 14.0 | 2 |
| 35 | 1479044100000000 | 13.0 | 2 |
| 60 | 1479045600000000 | 8.0 | 1 |
| 65 | 1479045900000000 | 9.0 | 1 |
My current results have some problems
| inMins | unixTime | temp | coldOrHot |
|--------|------------------|------|-----------|
| 25 | 1479043500000000 | 15.0 | 2 |
| 30 | 1479043800000000 | 14.0 | 2 |
| 60 | 1479045600000000 | 8.0 | 1 |
| 65 | 1479045900000000 | 9.0 | 1 |
| 70 | 1479046200000000 | 10.0 | 1 |
| 75 | 1479046500000000 | 11.0 | 1 |
| 80 | 1479046800000000 | 12.0 | 1 |
| 85 | 1479047100000000 | 13.0 | 1 |
| 90 | 1479047400000000 | 14.0 | 1 |
SQL
Select
inMins,
unixTime,
temp,
coldOrHot
from
(Select
inMins,
unixTime,
temp,
-- 1 means Cold, 2 means Hot, 0 is noise
if(temp=theLowInWindowDesc,1,
if(temp=theHighInWindowDesc,2,0)) as coldOrHot,
theHighInWindowDesc,
theLowInWindowDesc
FROM
(SELECT
inMins,
unixTime,
temp,
theHighInWindowDesc,
theLowInWindowDesc
FROM
(Select
inMins,
unixTime,
temp,
MAX(temp) OVER(ORDER BY
unixTime desc RANGE BETWEEN 60 * 60 * 1000000 PRECEDING
AND CURRENT ROW) AS theHighInWindowDesc,
MIN(temp) OVER(ORDER BY
unixTime desc RANGE BETWEEN 60 * 60 * 1000000 PRECEDING
AND CURRENT ROW) AS theLowInWindowDesc
FROM
[esheetzbq:findingLocalExtrema.timeSeriesForKevin]
ORDER BY
inMins asc
)
)
)
where coldOrHot=1 or coldOrHot=2
Problems
Upvotes: 2
Views: 104
Reputation: 173028
Here we go. Below is for BigQuery Standard SQL
I havent done any attempts to improve/optimize query - rather purposely kept it "broken" to subqueries exactly the way i wrote them - to make sure the logic is easily traced and thus understood
I have included below data for easy testing but you can comment out data portion if you want to test it on real data
Have fun :o)
#standardSQL
WITH `esheetzbq.findingLocalExtrema.timeSeriesForKevin` AS (
SELECT 0 AS inMins, 1479042000000000 AS unixTime, 10.0 AS temp UNION ALL SELECT 5 AS inMins, 1479042300000000 AS unixTime, 11.0 AS temp UNION ALL SELECT 10 AS inMins, 1479042600000000 AS unixTime, 12.0 AS temp UNION ALL SELECT 15 AS inMins, 1479042900000000 AS unixTime, 13.0 AS temp UNION ALL SELECT 20 AS inMins, 1479043200000000 AS unixTime, 14.0 AS temp UNION ALL SELECT 25 AS inMins, 1479043500000000 AS unixTime, 15.0 AS temp UNION ALL SELECT 30 AS inMins, 1479043800000000 AS unixTime, 14.0 AS temp UNION ALL SELECT 35 AS inMins, 1479044100000000 AS unixTime, 13.0 AS temp UNION ALL
SELECT 40 AS inMins, 1479044400000000 AS unixTime, 12.0 AS temp UNION ALL SELECT 45 AS inMins, 1479044700000000 AS unixTime, 11.0 AS temp UNION ALL SELECT 50 AS inMins, 1479045000000000 AS unixTime, 10.0 AS temp UNION ALL SELECT 55 AS inMins, 1479045300000000 AS unixTime, 9.0 AS temp UNION ALL SELECT 60 AS inMins, 1479045600000000 AS unixTime, 8.0 AS temp UNION ALL SELECT 65 AS inMins, 1479045900000000 AS unixTime, 9.0 AS temp UNION ALL SELECT 70 AS inMins, 1479046200000000 AS unixTime, 10.0 AS temp UNION ALL SELECT 75 AS inMins, 1479046500000000 AS unixTime, 11.0 AS temp UNION ALL SELECT 80 AS inMins, 1479046800000000 AS unixTime, 12.0 AS temp UNION ALL SELECT 85 AS inMins, 1479047100000000 AS unixTime, 13.0 AS temp UNION ALL SELECT 90 AS inMins, 1479047400000000 AS unixTime, 14.0 AS temp UNION ALL SELECT 95 AS inMins, 1479047700000000 AS unixTime, 15 AS temp UNION ALL SELECT 100 AS inMins, 1479048000000000 AS unixTime, 16 AS temp UNION ALL SELECT 105 AS inMins, 1479048300000000 AS unixTime, 17 AS temp UNION ALL SELECT 110 AS inMins, 1479048600000000 AS unixTime, 18 AS temp UNION ALL
SELECT 115 AS inMins, 1479048900000000 AS unixTime, 19 AS temp UNION ALL SELECT 120 AS inMins, 1479049200000000 AS unixTime, 20 AS temp UNION ALL SELECT 125 AS inMins, 1479049500000000 AS unixTime, 21 AS temp UNION ALL SELECT 130 AS inMins, 1479049800000000 AS unixTime, 22 AS temp UNION ALL SELECT 135 AS inMins, 1479050100000000 AS unixTime, 23 AS temp UNION ALL SELECT 140 AS inMins, 1479050400000000 AS unixTime, 24 AS temp UNION ALL SELECT 145 AS inMins, 1479050700000000 AS unixTime, 25 AS temp UNION ALL SELECT 150 AS inMins, 1479051000000000 AS unixTime, 26 AS temp UNION ALL SELECT 155 AS inMins, 1479051300000000 AS unixTime, 27 AS temp UNION ALL SELECT 160 AS inMins, 1479051600000000 AS unixTime, 28 AS temp UNION ALL SELECT 165 AS inMins, 1479051900000000 AS unixTime, 29 AS temp UNION ALL SELECT 170 AS inMins, 1479052200000000 AS unixTime, 30 AS temp UNION ALL SELECT 175 AS inMins, 1479052500000000 AS unixTime, 31 AS temp UNION ALL SELECT 180 AS inMins, 1479052800000000 AS unixTime, 32 AS temp UNION ALL SELECT 185 AS inMins, 1479053100000000 AS unixTime, 33 AS temp UNION ALL SELECT 190 AS inMins, 1479053400000000 AS unixTime, 34 AS temp UNION ALL SELECT 195 AS inMins, 1479053700000000 AS unixTime, 35 AS temp UNION ALL SELECT 200 AS inMins, 1479054000000000 AS unixTime, 36 AS temp UNION ALL SELECT 205 AS inMins, 1479054300000000 AS unixTime, 37 AS temp UNION ALL SELECT 210 AS inMins, 1479054600000000 AS unixTime, 38 AS temp
), y AS (
SELECT inMins, unixTime, temp, delta,
IFNULL(SUM(new_group_flag) OVER(ORDER BY unixTime ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS grp
FROM (
SELECT inMins, unixTime, temp,
LEAD(temp) OVER(ORDER BY unixTime) - temp AS delta,
CAST(SIGN(LEAD(temp) OVER(ORDER BY unixTime) - temp) != IFNULL(SIGN(temp - LAG(temp) OVER(ORDER BY unixTime)), SIGN(LEAD(temp) OVER(ORDER BY unixTime) - temp)) AS INT64) AS new_group_flag
FROM `esheetzbq.findingLocalExtrema.timeSeriesForKevin`
)
), yy AS (
SELECT inMins, unixTime, temp, delta, grp FROM y UNION ALL
SELECT inMins, unixTime, temp, delta, grp + 1 AS grp
FROM (
SELECT inMins, unixTime, temp, delta, grp,
unixTime - MAX(unixTime) OVER(PARTITION BY grp ORDER BY unixTime DESC) AS qq
FROM y
) WHERE qq = 0
), v AS (
SELECT inMins, unixTime, temp, delta,
MIN(temp) OVER(PARTITION BY grp ORDER BY unixTime RANGE BETWEEN CURRENT ROW AND 3600000000 FOLLOWING) AS min_temp,
MAX(temp) OVER(PARTITION BY grp ORDER BY unixTime RANGE BETWEEN CURRENT ROW AND 3600000000 FOLLOWING) AS max_temp
FROM yy
)
SELECT inMins, unixTime, temp, IF(delta=1, 1, 2) AS coldOrHot
FROM v
WHERE ABS(max_temp - temp) >= 5 OR ABS(min_temp - temp) >= 5
In case you will go this direction check also Enabling Standard SQL and Migrating from legacy SQL for more details if needed
Upvotes: 2