Reputation: 59175
We are trying to implement a peak detection algorithm using BigQuery. Basically we want to track if there is a threshold exceedance, and then just output the start and end time.
For example, with GSOD weather data pick an arbitrary temperature threshold. Basically for a given GSOD location, find the dates where it is above 70 degrees but don't count it as an event until it drops back below 70 degrees.
Upvotes: 1
Views: 480
Reputation: 59175
LAG() and LEAD() will help us peek at the next and previous days temperature. We'll know an event has started when the previous day is below 70, and the next day is over it. Same, but reversed, to detect the end.
SELECT day, prevday, temp, nextday,
IF((temp>prevday and prevday <70), 'start', 'end') period
FROM (
SELECT day, temp,
LEAD(temp) OVER(ORDER BY day) nextday,
LAG(temp) OVER(ORDER BY day) prevday
FROM (
SELECT year*10000+month*100+day day, mean_temp temp
FROM [bigquery-samples:weather_geo.gsod]
WHERE station_number = 8404
)
)
WHERE (temp > 70 and prevday < 70 and nextday > 70)
OR (nextday < 70 and temp > 70 and prevday>70)
Results:
Row day prevday temp nextday period
1 20091009 77.0 74.0 68.7 end
2 20091013 69.0 72.0 74.8 start
3 20091016 73.2 70.6 68.9 end
4 20091029 69.2 72.7 75.3 start
5 20091106 73.8 72.7 67.6 end
...
(2.8s elapsed, 4.53 GB processed)
Upvotes: 1