Chad Showalter
Chad Showalter

Reputation: 401

How to write join queries on TimescaleDB hypertables that don't scan chunks outside a given time range

Consider the following set of tables:

create table devices(
    id integer,
    primary key(id)
);

create table metrics(
                        ts timestamp,
                        id integer,
                        power double precision,
                        primary key (ts, id)
);
select create_hypertable('metrics', 'ts', chunk_time_interval := interval '1 days');

create table forecast(
                         ts timestamp,
                         id integer,
                         forecast_power double precision,
                         primary key (ts, id)
);
select create_hypertable('forecast', 'ts', chunk_time_interval := interval '1 days');

create table default_forecast(
                                 ts timestamp,
                                 default_forecast_power double precision,
                                 primary key(ts)
);
select create_hypertable('default_forecast', 'ts', chunk_time_interval := interval '1 days');

and this view:

create or replace view vw_forecast as
(
select d.id,
       df.ts,
       coalesce(f.forecast_power, df.default_forecast_power) as forecast_power
from devices d
         join default_forecast df on true
         left join forecast f on df.ts = f.ts and f.id = d.id
    );

adding some data:

insert into metrics(ts, id, power)
select times.ts,
       d.id,
       5
from devices d,
     generate_series('2000-01-01 00:00:00'::timestamp, '2001-01-01 00:00:00'::timestamp,
                     '5 minutes'::interval) as times(ts);

insert into forecast(ts, id, forecast_power)
select times.ts,
       d.id,
       6
from devices d,
     generate_series('2000-01-01 00:00:00'::timestamp, '2001-01-01 00:00:00'::timestamp,
                     '2 hour'::interval) as times(ts);

insert into default_forecast(ts, default_forecast_power)
select times.ts,
       7
from generate_series('2000-01-01 00:00:00'::timestamp, '2001-01-01 00:00:00'::timestamp,
                     '1 hour'::interval) as times(ts);

and executing this query:

explain(analyze, format json)
select *
from metrics m
         join vw_forecast f on f.id = m.id and f.ts = time_bucket('1 hour', m.ts)
where m.ts between '2000-01-10 00:00:00' and '2000-01-11 00:00:00';

I observe that the query plan results in a sequential scan on both the forecast and default_forecast tables. I realize that I could expose the timestamp columns from both the forecast and the default_forecast tables in vw_forecast and additionally add constraints in the where clause of my query over these timestamp columns which would eliminate the sequential scan. However, I would like to use this view in multiple contexts, and not burden consumers of the view who join against it with adding multiple constraints in order to get reasonable performance. Is there a way to accomplish my goal?

Upvotes: 1

Views: 618

Answers (1)

jonatasdp
jonatasdp

Reputation: 1412

Can you try to add some extra where clause limiting the vw_forecast table too?

Some new fragment like:

and f.ts between '2000-01-10 00:00:00' and '2000-01-11 00:00:00';

Looking the full view with explain:

explain(analyze)
select *
from metrics m
         join vw_forecast f on f.id = m.id and f.ts = time_bucket('1 hour', m.ts)
where m.ts between '2000-01-10 00:00:00' and '2000-01-11 00:00:00' 
and f.ts between '2000-01-10 00:00:00' and '2000-01-11 00:00:00';
...
│ Planning Time: 79.190 ms                                                                                                                                                                               │
│ Execution Time: 273.094 ms

Comparing without the extra where clause:

explain(analyze)
select *
from metrics m
         join vw_forecast f on f.id = m.id and f.ts = time_bucket('1 hour', m.ts)
where m.ts between '2000-01-10 00:00:00' and '2000-01-11 00:00:00';
...
│ Planning Time: 207.309 ms                                                                                                                                                                              │
│ Execution Time: 414.093 ms

Note that the where clause is on your query not attached to your view directly.

Upvotes: 1

Related Questions