Reputation: 99
The following is the EXPLAIN output of a query with set enable_seqscan = true.
Hash Join (cost=1028288.04..278841855100.04 rows=429471108 width=125)
Hash Cond: ((u.destination)::text = (n.mid)::text)
-> Nested Loop (cost=0.00..278587474234.17 rows=429471108 width=112)
Join Filter: (((u.destination)::text <> (u2.mid)::text) AND ("position"((u2.path_name)::text, (suffix(u.path_name))::text) = 0) AND (((prefix((u.path_name)::text))::text = (prefix((u2.path_name)::text))::text) OR ((prefix((u.path_name)::text))::text = 'common'::text)))
-> Seq Scan on unresolved u2 (cost=0.00..2780546.32 rows=117608632 width=79)
-> Index Scan using unresolved__mid on unresolved u (cost=0.00..1864.44 rows=492 width=53)
Index Cond: ((u.mid)::text = (u2.destination)::text)
-> Hash (cost=488335.24..488335.24 rows=27237024 width=33)
-> Seq Scan on name n (cost=0.00..488335.24 rows=27237024 width=33)
(9 rows)
The following is the EXPLAIN output of the same query but with set enable_seqscan = false.
Hash Join (cost=102089128.45..279381508122.13 rows=429471108 width=125)
Hash Cond: ((u.destination)::text = (n.mid)::text)
-> Nested Loop (cost=0.00..279026066415.86 rows=429471108 width=112)
Join Filter: (((u.destination)::text <> (u2.mid)::text) AND ("position"((u2.path_name)::text, (suffix(u.path_name))::text) = 0) AND (((prefix((u.path_name)::text))::text = (prefix((u2.path_name)::text))::text) OR ((prefix((u.path_name)::text))::text = 'common'::text)))
-> Index Scan using unresolved__destination on unresolved u2 (cost=0.00..441372728.01 rows=117608632 width=79)
-> Index Scan using unresolved__mid on unresolved u (cost=0.00..1864.44 rows=492 width=53)
Index Cond: ((u.mid)::text = (u2.destination)::text)
-> Hash (cost=101549175.65..101549175.65 rows=27237024 width=33)
-> Index Scan using name_pkey on name n (cost=0.00..101549175.65 rows=27237024 width=33)
(9 rows)
I would like to know how long would the query take. It's been running for about 10 hours now. Is the estimated time deduced from the 'cost' in the first row, in the case of the latter it is '279381508122.13 ms' which is 8.8 years?! :-(
Upvotes: 1
Views: 1175
Reputation: 16417
The numbers do not correspond to time. They are relative numbers only. From the documentation (Using Explain):
The costs are measured in arbitrary units determined by the planner's cost parameters (see Section 18.6.2). Traditional practice is to measure the costs in units of disk page fetches; that is, seq_page_cost is conventionally set to 1.0 and the other cost parameters are set relative to that. (The examples in this section are run with the default cost parameters.)
In any event, the Nested Loop due to a somewhat vague join condition appears to be killing your performance. Hard to tell without seeing the original query and table/index structures, but you might find benefit in creating a functional index on unresolved, assuming "prefix()" is an IMMUTABLE function:
CREATE INDEX idx_path_name_prefix ON unresolved (prefix(path_name));
Upvotes: 1