Reputation: 15416
I have a query:
SELECT study."id"
FROM study
JOIN report ON (report."studyId" = study."id")
WHERE
study.facts->'patientName'->>'value' = 'HELLO WORLD' OR
report.variables->'patientName'->>'value' = 'HELLO WORLD'
All tables have indexes.
Why is this query taking 4.5s across 6000 rows? Explain analyze output below:
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Hash Join (cost=383.69..1403.67 rows=39 width=52) (actual time=2734.257..2734.260 rows=0 loops=1) |
| Hash Cond: ((study.id)::text = (report."studyId")::text) |
| Join Filter: ((((study.facts -> 'patientName'::text) ->> 'value'::text) = 'HELLO WORLD'::text) OR (((report.variables -> 'patientName'::text) ->> 'value'::text) = 'HELLO WORLD'::text)) |
| Rows Removed by Join Filter: 7453 |
| -> Seq Scan on study (cost=0.00..1000.23 rows=7523 width=70) (actual time=0.020..13.548 rows=7523 loops=1) |
| -> Hash (cost=290.53..290.53 rows=7453 width=70) (actual time=5.052..5.053 rows=7453 loops=1) |
| Buckets: 8192 Batches: 1 Memory Usage: 808kB |
| -> Seq Scan on report (cost=0.00..290.53 rows=7453 width=70) (actual time=0.014..3.235 rows=7453 loops=1) |
| Planning Time: 0.896 ms |
| Execution Time: 2734.323 ms |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I have a UNION query that does the same thing, but is much quicker (0.001s). I want to know more about why my JOIN query is so much slower:
SELECT id::text
FROM study
WHERE study.facts->'patientName'->>'value' = 'HELLO WORLD'
UNION
SELECT report."studyId"::text
FROM report
WHERE report.variables->'patientName'->>'value' = 'HELLO WORLD';
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|-------------------------------------------------------------------------------------------------------------------------------------------------|
| HashAggregate (cost=143.12..143.51 rows=39 width=32) (actual time=0.040..0.041 rows=0 loops=1) |
| Group Key: ((study.id)::text) |
| -> Append (cost=4.58..143.02 rows=39 width=32) (actual time=0.038..0.039 rows=0 loops=1) |
| -> Bitmap Heap Scan on study (cost=4.58..134.14 rows=38 width=32) (actual time=0.026..0.026 rows=0 loops=1) |
| Recheck Cond: (((facts -> 'patientName'::text) ->> 'value'::text) = 'HELLO WORLD'::text) |
| -> Bitmap Index Scan on "IDX_facts_patientName" (cost=0.00..4.57 rows=38 width=0) (actual time=0.023..0.023 rows=0 loops=1) |
| Index Cond: (((facts -> 'patientName'::text) ->> 'value'::text) = 'HELLO WORLD'::text) |
| -> Index Scan using "IDX_variables_patientName" on report (cost=0.28..8.30 rows=1 width=32) (actual time=0.012..0.012 rows=0 loops=1) |
| Index Cond: (((variables -> 'patientName'::text) ->> 'value'::text) = 'HELLO WORLD'::text) |
| Planning Time: 0.560 ms |
| Execution Time: 0.103 ms |
+-------------------------------------------------------------------------------------------------------------------------------------------------+
Upvotes: 1
Views: 87
Reputation: 44240
OR
is always difficult for the optimiser(s). Since you dont need any fields from the report
table, you could tuck that away into an EXISTS() subquery, this will probably result in a bitmap-index scan.:
SELECT s."id"
FROM study s
WHERE s.facts->'patientName'->>'value' = 'HELLO WORLD'
OR EXISTS (
SELECT * FROM report r
WHERE r."studyId" = s."id"
AND r.variables->'patientName'->>'value' = 'HELLO WORLD'
);
Upvotes: 2
Reputation: 437
JOINS and UNIONS are two completely different operations.
JOIN adds columns from both tables to the result set, matching both tables with you matching criteria ( WHERE CLAUSE
)
UNION on the others hand, append the result of 1 table on top of the other. In more detail, the difference between an UNION
and a UNION ALL
, is that the union is a SELECT DISTINCT
from the UNION ALL
In summary, the UNION
does not require a matching criteria to select the ROWS
to add to the result.
Upvotes: 2