Randomize

Reputation: 9103

How can I fill gaps in a time series with TimescaleDB?

I have data like this:

Month_event | No_people | Cost
2017-03-01  | 78        | 120000 
2017-01-01  | 67        | 220000 
2017-07-01  | 121       | 320000 
2017-04-01  | 70        | 100000 

What I normally do from my code is run a windowed SQL query in PostgreSQL that fills in the missing months of the time series by copying over the values from the previous month:

Month_event | No_people | Cost
2017-01-01  | 67        | 220000 
2017-02-01  | 67        | 220000 
2017-03-01  | 78        | 120000 
2017-04-01  | 70        | 100000 
2017-05-01  | 70        | 100000  
2017-06-01  | 70        | 100000 
2017-07-01  | 121       | 320000 

This is my usual query:

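WITH
-- One row per month, from 2005-01-01 up to the release month stored in mtd.
calendar AS (
    SELECT interval_date::date
    FROM generate_series('2005-01-01'::date, (SELECT release_month FROM mtd), '1 month'::interval) interval_date
),
-- Each data row plus the date of the next available row (NULL for the last one).
m AS (
    SELECT *, LEAD(month_event) OVER (ORDER BY month_event) AS next_date
    FROM my_data
)
-- Repeat each data row for every calendar month up to the month before the next row.
SELECT *
FROM calendar c
JOIN m
    ON c.interval_date BETWEEN m.month_event AND
    (CASE WHEN m.next_date IS NULL THEN date_trunc('month', current_date) ELSE m.next_date - '1 month'::interval END);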

Since TimescaleDB is a PostgreSQL extension, I suppose I can reuse the same query there. Is there a better-performing solution specific to TimescaleDB? I could not find one in the documentation.

Upvotes: 1

Views: 428

Answers (1)

davidk

Reputation: 1053

TimescaleDB 1.2 added new functions for this; see: https://blog.timescale.com/sql-functions-for-time-series-analysis/ The gap-filling functions should do what you're looking for much more easily.
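
As a rough sketch of what that could look like for the data in the question, the query below uses time_bucket_gapfill to generate the missing months and locf ("last observation carried forward") to copy the previous values into them. The table name my_data comes from the question; the column names month_event, no_people and cost, the date range, and the month_bucket alias are assumptions made for illustration. Note also that older TimescaleDB versions reject bucket widths expressed in months, so whether '1 month' is accepted depends on the installed version and should be checked against its documentation.

```sql
-- Sketch only, not tested against a specific TimescaleDB version.
-- Assumes my_data(month_event date, no_people int, cost int) with at most
-- one row per month, and a version that accepts month-based bucket widths.
SELECT
    time_bucket_gapfill('1 month', month_event,
                        '2017-01-01'::date,   -- start of the range (inclusive)
                        '2017-08-01'::date)   -- end of the range (exclusive)
        AS month_bucket,
    locf(max(no_people)) AS no_people,  -- carry the last seen value into empty buckets
    locf(max(cost))      AS cost
FROM my_data
WHERE month_event >= '2017-01-01'
  AND month_event <  '2017-08-01'
GROUP BY month_bucket
ORDER BY month_bucket;
```

The max() aggregates are only there because locf wraps an aggregate; with one row per month they simply return that row's value. If months cannot be used as a bucket width on your version, the calendar-join query from the question still works unchanged.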

Upvotes: 2
