pilcrow
pilcrow

Reputation: 58589

Query efficiency of timestamp-partitioned tables using CURRENT_TIMESTAMP

Given the following table partitioning under PostgreSQL 9.0.3:

CREATE TABLE records (
  ts TIMESTAMP,
  ...
);

CREATE TABLE records_2010 (
  CHECK (ts >= '2010-01-01 00:00:00' AND ts < '2011-01-01 00:00:00')
) INHERITS (records);

CREATE TABLE records_2011 (
  CHECK (ts >= '2011-01-01 00:00:00' AND ts < '2012-01-01 00:00:00')
) INHERITS (records);

I expected the following SELECT queries to have the same EXPLAINed plan, consulting only "records" and "records_2011", but they differ:

BEGIN;
-- Assume CURRENT_TIMESTAMP is 9 a.m. on 5 March 2011
SELECT * FROM records WHERE ts >= '2011-03-05 09:00:00'; -- scans 2 tables
SELECT * FROM records WHERE ts >= CURRENT_TIMESTAMP;     -- scans all 3 tables
COMMIT;

Given that CURRENT_TIMESTAMP returns a constant value for the duration of its enclosing transactions, why doesn't the query with CURRENT_TIMESTAMP take advantage of Postgres' partitioning and only scan two tables?

UPDATE:

This isn't possible right now, but it is recognized as an area to improve. PostgreSQL 9.1 may address this behavior in the query executor.

Upvotes: 1

Views: 2851

Answers (1)

Andomar
Andomar

Reputation: 238086

If you run a query for the first time, PostgreSQL determines a query plan. This is an expensive operation and the result is cached. So the query has to work for future executions as well.

Your first query will never need records_2010, regardless of when you run it.

But the second query uses a variable CURRENT_TIMESTAMP. The optimizer does not know that time can only increase, and generates a plan that will work for any value of ts. That means it has to look in all three tables.

Upvotes: 5

Related Questions