Reputation: 162
I have a very simple query:
SELECT count(*) FROM link_list ll
WHERE ll.tsvector @@ to_tsquery('english', 'advertising|slogan|publicity|ads|advertise|brand|promo|ad|propaganda|sales|promote|logo|hype|commercial|consumerism|pamphlet|viral|salesperson|selling|blurb|advert|merchandise|marking|products|campaign|packaging|billboard|advertisement|promotion|questionnaire|marketing')
It runs fast (Here's the output from EXPLAIN ANALYZE). It uses the GIN index, and works exactly as expected.
Life is good.
But now, let me tweak the query slightly, and now it takes 2x as long!
WITH query AS
(
SELECT to_tsquery('english',('advertising|slogan|publicity|ads|advertise|brand|promo|ad|propaganda|sales|promote|logo|hype|commercial|consumerism|pamphlet|viral|salesperson|selling|blurb|advert|merchandise|marking|products|campaign|packaging|billboard|advertisement|promotion|questionnaire|marketing')) query
)
SELECT count(*) FROM link_list ll
WHERE ll.tsvector @@ (SELECT query FROM query);
(Output from EXPLAIN ANALYZE)
I would just use the first query... but unfortunately, the list of synonyms has to be dynamically generated, which I pull from another table.
For some strange reason, putting the tsquery
inside WITH
makes Postgresql not use the index as efficiently (It thinks it'll be a quick and dirty job that doesn't need an index, and it ends up being dead wrong).
Why in the world is this happening??
Upvotes: 0
Views: 130
Reputation: 44305
Neither one of those executions seems all that fast.
Your use of the WITH inhibits the parallel plan. If your bottleneck is IO (which seems likely) you can get parallel IO without parallel query by setting effective_io_concurrency to a value > 1.
The time spent setting up JIT is over 10% of the faster plan, and is probably a complete waste. You can set jit = off
(or turn it off globally in the .conf file) to spare that time.
Your recheck of lossy blocks also wastes time. You should increase work_mem to get rid of those. But the waste is of mostly CPU time, so the effect will be small if the bottleneck is IO, not CPU. It still has to visit the same set of blocks. (An exception to this is if TOAST is heavily used, then rows not rechecked don't need to be assembled from TOAST, so those TOAST block reads are avoided.)
Upvotes: 1