EM0

Reputation: 6327

PostgreSQL partitioning with joined table - partition constraint not used in query plan

I have a large table in PostgreSQL 9.2 that I've partitioned as described in the manual. Well... almost! My real partition key is not in the partitioned table itself, but in a joined table, like this (simplified):

-- millions to tens of millions of rows
CREATE TABLE data
(
  slice_id integer NOT NULL,
  point_id integer NOT NULL,
  -- ... data columns ...,
  CONSTRAINT pk_data PRIMARY KEY (slice_id, point_id),
  CONSTRAINT fk_data_slice FOREIGN KEY (slice_id) REFERENCES slice (id),
  CONSTRAINT fk_data_point FOREIGN KEY (point_id) REFERENCES point (id)
);

-- hundreds to thousands of rows
CREATE TABLE slice
(
  id serial NOT NULL,
  partition_date timestamp without time zone NOT NULL,
  other_date timestamp without time zone NOT NULL,
  int_key integer NOT NULL,
  CONSTRAINT pk_slice PRIMARY KEY (id)
);

-- about 40,000 rows
CREATE TABLE point
(
  -- ... similar to "slice" ...
);

The table to be partitioned (data) contains rows for every combination of point and slice, each of which has a compound key. I want to partition it on only one of the key columns, partition_date, which is part of slice. Of course, the check constraints on my child tables cannot include that directly, so instead I include the range of all slice.id values corresponding to that partition_date, like this:

ALTER TABLE data_part_123 ADD CONSTRAINT ck_data_part_123
    CHECK (slice_id >= 1234 AND slice_id <= 1278);
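
For completeness, the child tables are plain inheritance children in the usual pre-declarative-partitioning style; a hypothetical sketch for one partition (the table name and slice_id bounds are illustrative):

    -- Child table for one partition_date; actual slice_id bounds
    -- come from looking up "slice" for that date.
    CREATE TABLE data_part_123
    (
      CONSTRAINT pk_data_part_123 PRIMARY KEY (slice_id, point_id)
    ) INHERITS (data);

    ALTER TABLE data_part_123 ADD CONSTRAINT ck_data_part_123
        CHECK (slice_id >= 1234 AND slice_id <= 1278);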

This all works fine for inserting data. However, queries do not use the above CHECK constraint, e.g.:

SELECT *
FROM data d
JOIN slice s ON d.slice_id = s.id
WHERE s.partition_date = '2013-07-23'

I can see in the query plan that this still scans all child tables. I've tried rewriting the query in several ways, including a CTE and a sub-select, but that hasn't helped.
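
For comparison, filtering on slice_id with literal bounds does exclude partitions, assuming constraint_exclusion is at its 9.2 default of partition (the bounds here are just the ones from the example constraint above):

    SET constraint_exclusion = partition;

    -- The planner can compare these literals against each child's
    -- CHECK constraint at plan time, so only data_part_123 is scanned.
    SELECT *
    FROM data
    WHERE slice_id BETWEEN 1234 AND 1278;

It is only when the bounds have to come from the join against slice that the exclusion stops working.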

Is there any way I can get the planner to "understand" my partitioning scheme? I don't really want to duplicate the partition key millions of times in the data table.

Query plan looks like this:

Aggregate  (cost=539243.88..539243.89 rows=1 width=0)
  ->  Hash Join  (cost=8.88..510714.02 rows=11411945 width=0)
        Hash Cond: (d.slice_id = s.id)
        ->  Append  (cost=0.00..322667.41 rows=19711542 width=4)
              ->  Seq Scan on data d  (cost=0.00..0.00 rows=1 width=4)
              ->  Seq Scan on data_part_123 d  (cost=0.00..135860.10 rows=8299610 width=4)
              ->  Seq Scan on data_part_456 d  (cost=0.00..186807.31 rows=11411931 width=4)
        ->  Hash  (cost=7.09..7.09 rows=143 width=4)
              ->  Seq Scan on slice s  (cost=0.00..7.09 rows=143 width=4)
                    Filter: (partition_date = '2013-07-23 00:00:00'::timestamp without time zone)

Upvotes: 6

Views: 3762

Answers (2)

Clodoaldo Neto

Reputation: 125204

The only way to achieve this is to build the query dynamically:

create function select_from_data (p_date date)
returns setof data as $function$

declare
    min_slice_id integer;
    max_slice_id integer;

begin
    select min(slice_id), max(slice_id)
    into min_slice_id, max_slice_id
    from slice
    where partition_date = p_date;

return query execute
    $dynamic$
        select *
        from data
        where slice_id between $1 and $2
    $dynamic$
    using min_slice_id, max_slice_id;

end;
$function$ language plpgsql;

This builds the query with the appropriate slice range for the given date and plans it at run time, when the planner has the concrete values it needs to exclude the non-matching partitions.
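
A call then looks like an ordinary set-returning function (the date is just an example):

    -- Planned at execution time with literal slice_id bounds,
    -- so only the matching partition(s) are scanned.
    SELECT count(*) FROM select_from_data('2013-07-23');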

To make the function more generic without losing the planner's ability to see concrete values at run time, use the or parameter is null construct in the filter:

create function select_from_data (
    p_date date,
    value_1 integer default null,
    value_2 integer default null
)
returns setof data as $function$

declare
    min_slice_id integer;
    max_slice_id integer;

begin
    select min(slice_id), max(slice_id)
    into min_slice_id, max_slice_id
    from slice
    where partition_date = p_date;

return query execute
    $dynamic$
        select *
        from data
        where
            slice_id between $1 and $2
            and (some_col = $3 or $3 is null)
            and (another_col = $4 or $4 is null)
    $dynamic$
    using min_slice_id, max_slice_id, value_1, value_2;

end;
$function$ language plpgsql;

Now if a parameter is passed as null it will not interfere with the query.
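
For example (some_col and another_col are the placeholder columns from the function body):

    -- Only the date filter applies; both defaults are null,
    -- so "or $3 is null" / "or $4 is null" make those clauses true.
    SELECT * FROM select_from_data('2013-07-23');

    -- Additionally filters on some_col = 42.
    SELECT * FROM select_from_data('2013-07-23', 42);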

Upvotes: 6

Craig Ringer

Reputation: 324285

This scheme just isn't going to work. constraint_exclusion is simple and dumb: to exclude a partition, the planner must be able to prove, by examining the query at plan time, that the query cannot touch it.

Excluding partitions during query execution isn't supported at this time. There's a lot of room for improvement in the rudimentary partitioning support Pg offers, and execution-time constraint exclusion is just one of the areas that could use work.

Your application will need to know about the partitions and their constraints, and will need to explicitly join on the union of only the partition(s) that are required.

In this case I'm not sure how PostgreSQL could even do what you want. I guess you want it to project the constraint through the composite key on the join, asserting that since the query specifies s.partition_date = '2013-07-23' and a query for all slice IDs with s.partition_date = '2013-07-23' finds them in the range slice_id >= 1234 AND slice_id <= 1278 then only partition data_part_123 should be scanned.

The trouble is that at planning time PostgreSQL has absolutely no idea that s.partition_date = '2013-07-23' corresponds to a particular range of slice IDs. It might be able to figure it out from correlation stats if it kept them, but table stats are only approximations, not the proof that's required for partitioning.

I suspect you will need to denormalize your data a little, duplicating slice.partition_date in each data row if you wish to partition by it. You can either try to make sure not to get them out of sync, or (what I'd do) create a UNIQUE constraint on slice(id, partition_date) then add a FOREIGN KEY reference from the partitions of data into slice, thus making sure that they cannot get out of sync at the cost of some additional index maintenance and insertion costs.
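
A minimal sketch of that second option, using the names from the question (the date bounds are illustrative, and the new column is shown being added to an empty partition; a populated one would need a backfill first):

    -- Allow (id, partition_date) to be referenced together.
    ALTER TABLE slice
        ADD CONSTRAINT uq_slice_id_date UNIQUE (id, partition_date);

    -- Each partition carries its own copy of the date...
    ALTER TABLE data_part_123
        ADD COLUMN partition_date timestamp without time zone NOT NULL;

    -- ...which the composite FK keeps in sync with slice...
    ALTER TABLE data_part_123
        ADD CONSTRAINT fk_data_part_123_slice
        FOREIGN KEY (slice_id, partition_date)
        REFERENCES slice (id, partition_date);

    -- ...so a CHECK on the date itself can drive constraint exclusion.
    ALTER TABLE data_part_123 ADD CONSTRAINT ck_data_part_123_date
        CHECK (partition_date >= '2013-07-01'
           AND partition_date <  '2013-08-01');

With that in place, a query filtering directly on data.partition_date can be pruned at plan time.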

Upvotes: 4
