Reputation: 331
I just noticed Postgres (checked on version 13 and 14) behavior that surprised me. I have a simple table volume
with id
and unique text column name
. Second table dir
has 3 columns: id
, volume_id
and path
. This table is partitioned by hash on volume_id
column. Here is full table schema with sample data:
CREATE TABLE dir (
id BIGSERIAL,
volume_id BIGINT,
path TEXT
) PARTITION BY HASH (volume_id);
CREATE TABLE dir_0
PARTITION OF dir FOR VALUES WITH (modulus 3, remainder 0);
CREATE TABLE dir_1
PARTITION OF dir FOR VALUES WITH (modulus 3, remainder 1);
CREATE TABLE dir_2
PARTITION OF dir FOR VALUES WITH (modulus 3, remainder 2);
CREATE TABLE volume(
id BIGINT,
name TEXT UNIQUE
);
INSERT INTO volume (id, name) VALUES (1, 'vol1'), (2, 'vol2'), (3, 'vol3');
INSERT INTO dir (volume_id, path) SELECT i % 3 + 1, 'path_' || i FROM generate_series(1,1000) AS i;
Now, given the volume name, I need to find all the rows from the dir
table on that volume. I can do that in 2 different ways.
Query #1
EXPLAIN ANALYZE
SELECT * FROM dir AS d
INNER JOIN volume AS v ON d.volume_id = v.id
WHERE v.name = 'vol1';
Which produces query plan:
QUERY PLAN |
---|
Hash Join (cost=1.05..31.38 rows=333 width=37) (actual time=0.186..0.302 rows=333 loops=1) |
Hash Cond: (d.volume_id = v.id) |
-> Append (cost=0.00..24.00 rows=1000 width=24) (actual time=0.006..0.154 rows=1000 loops=1) |
-> Seq Scan on dir_0 d_1 (cost=0.00..6.34 rows=334 width=24) (actual time=0.006..0.032 rows=334 loops=1) |
-> Seq Scan on dir_1 d_2 (cost=0.00..6.33 rows=333 width=24) (actual time=0.006..0.029 rows=333 loops=1) |
-> Seq Scan on dir_2 d_3 (cost=0.00..6.33 rows=333 width=24) (actual time=0.004..0.026 rows=333 loops=1) |
-> Hash (cost=1.04..1.04 rows=1 width=13) (actual time=0.007..0.007 rows=1 loops=1) |
Buckets: 1024 Batches: 1 Memory Usage: 9kB |
-> Seq Scan on volume v (cost=0.00..1.04 rows=1 width=13) (actual time=0.003..0.004 rows=1 loops=1) |
Filter: (name = 'vol1'::text) |
Rows Removed by Filter: 2 |
Planning Time: 0.500 ms |
Execution Time: 0.364 ms |
As you can see this query leads to a sequential scan on all 3 partitions of the dir
table.
Alternatively, we can write this query like this:
Query #2
EXPLAIN ANALYZE
SELECT * FROM dir AS d
WHERE volume_id = (SELECT id FROM volume AS v WHERE v.name = 'vol1');
In that case we get following query plan:
QUERY PLAN |
---|
Append (cost=1.04..27.54 rows=1000 width=24) (actual time=0.010..0.066 rows=333 loops=1) |
InitPlan 1 (returns $0) |
-> Seq Scan on volume v (cost=0.00..1.04 rows=1 width=8) (actual time=0.003..0.004 rows=1 loops=1) |
Filter: (name = 'vol1'::text) |
Rows Removed by Filter: 2 |
-> Seq Scan on dir_0 d_1 (cost=0.00..7.17 rows=334 width=24) (never executed) |
Filter: (volume_id = $0) |
-> Seq Scan on dir_1 d_2 (cost=0.00..7.16 rows=333 width=24) (never executed) |
Filter: (volume_id = $0) |
-> Seq Scan on dir_2 d_3 (cost=0.00..7.16 rows=333 width=24) (actual time=0.004..0.037 rows=333 loops=1) |
Filter: (volume_id = $0) |
Planning Time: 0.063 ms |
Execution Time: 0.093 ms |
Here we can see that partitions dir_0
and dir_1
have never executed
annotation.
My question is:
Why in the first case is there no partition pruning? Postgres already knows that the volume.name
column is unique and that it will translate into a single volume_id
. I would like to get a good intuition on when partition pruning can happen during query execution.
Upvotes: 0
Views: 954
Reputation: 246073
To get partition pruning with a hash join, you'd need to add a condition on d.volume_id
to your query. No inference is made from the join with volume
.
Your second query shows partition pruning; the "never executed" means that the query executor pruned the scan of certain partitions.
An alternative method that forces a nested loop join and prunes partitions should be
SELECT *
FROM volume AS v
CROSS JOIN LATERAL (SELECT * FROM dir
WHERE dir.volume_id = v.id
OFFSET 0) AS d
WHERE v.name = 'vol1';
OFFSET 0
prevents anything except a nested loop join. Different from your query, that should also work if volumn.name
is not unique.
Upvotes: 0