Reputation: 6394
Query A executes in microseconds:
SELECT t1.id
FROM (SELECT t0.id AS id FROM t0) AS t1
WHERE NOT (EXISTS (SELECT 1
FROM t2
WHERE t2.ph_id = t1.id AND t2.me_id = 1 AND t2.rt_id = 4))
LIMIT 20 OFFSET 0
But query B takes about 25 seconds:
SELECT t1.id, count(*) OVER () AS count
FROM (SELECT t0.id AS id FROM t0) AS t1
WHERE NOT (EXISTS (SELECT 1
FROM t2
WHERE t2.ph_id = t1.id AND t2.me_id = 1 AND t2.rt_id = 4))
LIMIT 20 OFFSET 0
(the difference is just one item in the select clause - a window aggregate)
EXPLAIN output is as follows, for A:
Limit (cost=0.00..1.20 rows=20 width=4)
-> Nested Loop Anti Join (cost=0.00..3449.22 rows=57287 width=4)
Join Filter: (t2.ph_id = t0.id)
-> Seq Scan on t0 (cost=0.00..1323.88 rows=57288 width=4)
-> Materialize (cost=0.00..1266.02 rows=1 width=4)
-> Seq Scan on t2 (cost=0.00..1266.01 rows=1 width=4)
Filter: ((me_id = 1) AND (rt_id = 4))
And for B:
Limit (cost=0.00..1.45 rows=20 width=4)
-> WindowAgg (cost=0.00..4165.31 rows=57287 width=4)
-> Nested Loop Anti Join (cost=0.00..3449.22 rows=57287 width=4)
Join Filter: (t2.ph_id = t0.id)
-> Seq Scan on t0 (cost=0.00..1323.88 rows=57288 width=4)
-> Materialize (cost=0.00..1266.02 rows=1 width=4)
-> Seq Scan on t2 (cost=0.00..1266.01 rows=1 width=4)
Filter: ((me_id = 1) AND (rt_id = 4))
I am adding the window aggregate to get the total number of rows before LIMITing, for the purpose of building a paging UI.
Upvotes: 0
Views: 4141
Reputation: 6394
Thank you for the other answers, which are correct in that the counting has to do more work, but I found the solution from another source. The statistics were not up to date.
After running the command...:
ANALYZE;
... Postgresql was able to choose a more appropriate query plan, and now both queries run very fast.
Upvotes: 1
Reputation: 6969
Can you check how long COUN(*) takes and what execution plan looks like:
SELECT count(*)
FROM (SELECT t0.id AS id FROM t0) AS t1
WHERE NOT (EXISTS (SELECT 1
FROM t2
WHERE t2.ph_id = t1.id AND t2.me_id = 1 AND t2.rt_id = 4))
This might give you an idea why it takes longer.
Basically first query only reads 20 first records that match the criteria from t0, while second query has to generate the full set of records matching the criteria to count them.
Upvotes: 3
Reputation: 1269873
Your original query can be written like this:
SELECT t0.id
FROM t0
WHERE NOT EXISTS (SELECT 1
FROM t2
WHERE t2.ph_id = t1.id AND t2.me_id = 1 AND t2.rt_id = 4
)
LIMIT 20 OFFSET 0;
You have no order by
, so the query can start returning results as they are found for the result set. When you add the window function:
SELECT t.0.id, count(*) over ()
Now it is counting the number of rows in the result set, so it has to generate the entire result set. Hence, instead of just getting the first twenty rows, the query has to generate all of them. That takes more time.
Upvotes: 3