Reputation: 4557
I feel like I will get lots of downvotes here, but let's give it a go.
I am trying to explain nested loops vs hash vs merge join to my students on real examples. However, I am struggling to find tables that would join with nested loops (I tried many different sizes, indexes setups, etc.). Postgres always uses hash join regardless of the table sizes, indexes, etc.
Could someone give an example of tables (with data) that would join with nested loops without explicitly running set enable_hashjoin = true;
beforehand?
Upvotes: 2
Views: 56
Reputation: 247865
Create some tables:
CREATE TABLE a (
a_id integer PRIMARY KEY,
a_val text NOT NULL
);
CREATE TABLE b (
b_id integer PRIMARY KEY,
a_id integer REFERENCES a(a_id) NOT NULL,
b_val text NOT NULL
);
-- *never* forget an index on the foreign key column
CREATE INDEX ON b(a_id);
Add some sample data:
INSERT INTO a
SELECT i, 'value ' || i FROM generate_series(1, 1000) i;
INSERT INTO b
SELECT i, (i + 1) / 2, 'value ' || i FROM generate_series(1, 2000) i;
Analyze the tables to get good statistics:
ANALYZE a;
ANALYZE b;
Let's run a sample query:
EXPLAIN SELECT a.a_val, b.b_val FROM a JOIN b USING (a_id) WHERE a_id = 42;
QUERY PLAN
---------------------------------------------------------------------------
Nested Loop (cost=0.55..16.62 rows=2 width=19)
-> Index Scan using a_pkey on a (cost=0.28..8.29 rows=1 width=13)
Index Cond: (a_id = 42)
-> Index Scan using b_a_id_idx on b (cost=0.28..8.31 rows=2 width=14)
Index Cond: (a_id = 42)
(5 rows)
Upvotes: 1
Reputation:
The following does a nested loop for me (without disabling hashjoins) on Postgres 10.5
create table one (id integer primary key, some_ts timestamp, some_value integer);
insert into one values (1, clock_timestamp(), 42),(2, clock_timestamp(), 42);
create table two (id integer primary key, one_id integer not null references one, some_ts timestamp);
insert into two
select i, 1, clock_timestamp()
from generate_series(1,10) i;
insert into two
select i, 2, clock_timestamp()
from generate_series(11,20) i;
create index on two (one_id);
explain (analyze)
select one.*, two.id, two.some_ts
from one
join two on one.id = two.one_id
where one.id = 1;
Results in:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.15..4.23 rows=1 width=28) (actual time=0.029..0.033 rows=10 loops=1)
-> Index Scan using one_pkey on one (cost=0.15..3.16 rows=1 width=16) (actual time=0.016..0.016 rows=1 loops=1)
Index Cond: (id = 1)
-> Seq Scan on two (cost=0.00..1.07 rows=1 width=16) (actual time=0.011..0.014 rows=10 loops=1)
Filter: (one_id = 1)
Rows Removed by Filter: 10
Planning time: 0.130 ms
Execution time: 0.058 ms
Online example: http://rextester.com/CXZZ12304
Upvotes: 1