Reputation: 1077
I have the following query that has 6 joins
EXPLAIN ANALYZE
SELECT
property.id,
full_address,
street_address,
street.street,
city.city as city,
state.state_code as state_code,
zipcode.zipcode as zipcode,
property_tax_history.tax AS property_tax,
property_tax_history.land AS land_value,
property_tax_history.improvements AS improvements_value,
property_tax_history.year AS tax_year,
property_sale_history.date_event AS event_date,
property_sale_history.event AS event,
property_sale_history.price AS event_price
FROM
property
INNER JOIN
street
ON street.id = property.street_id
INNER JOIN
city
ON city.id = property.city_id
INNER JOIN
state
ON state.id = property.state_id
INNER JOIN
zipcode
ON zipcode.id = property.zipcode_id
LEFT JOIN
property_sale_history
ON property_sale_history.property_id = property.id
LEFT JOIN
property_tax_history
ON property_tax_history.property_id = property.id
WHERE
full_address = ?;
Below are the EXPLAIN ANALYZE results
Nested Loop Left Join (cost=2.83..296.67 rows=6 width=227) (actual time=0.606..0.766 rows=28 loops=1)
-> Nested Loop Left Join (cost=2.26..62.52 rows=1 width=299) (actual time=0.544..0.551 rows=4 loops=1)
-> Nested Loop (cost=1.83..41.98 rows=1 width=284) (actual time=0.522..0.525 rows=1 loops=1)
-> Nested Loop (cost=1.54..33.67 rows=1 width=294) (actual time=0.482..0.485 rows=1 loops=1)
-> Nested Loop (cost=1.27..25.32 rows=1 width=307) (actual time=0.439..0.441 rows=1 loops=1)
-> Nested Loop (cost=0.98..17.02 rows=1 width=314) (actual time=0.392..0.394 rows=1 loops=1)
-> Index Scan using property_full_address on property (cost=0.56..8.57 rows=1 width=318) (actual time=0.339..0.340 rows=1 loops=1)
Index Cond: (full_address = '10951097-4Th-Ave-Chula-Vista-CA-91911'::citext)
-> Index Scan using street_pkey on street (cost=0.42..8.44 rows=1 width=28) (actual time=0.046..0.046 rows=1 loops=1)
Index Cond: (id = property.street_id)
-> Index Scan using city_id_pk on city (cost=0.29..8.30 rows=1 width=25) (actual time=0.044..0.044 rows=1 loops=1)
Index Cond: (id = property.city_id)
-> Index Scan using state_id_pk on state (cost=0.28..8.32 rows=1 width=19) (actual time=0.041..0.041 rows=1 loops=1)
Index Cond: (id = property.state_id)
-> Index Scan using zipcode_id_pk on zipcode (cost=0.29..8.30 rows=1 width=22) (actual time=0.037..0.037 rows=1 loops=1)
Index Cond: (id = property.zipcode_id)
-> Index Scan using property_sale_history_property_id on property_sale_history (cost=0.43..20.50 rows=4 width=31) (actual time=0.019..0.021 rows=4 loops=1)
Index Cond: (property_id = property.id)
-> Index Scan using property_tax_history_property_id on property_tax_history (cost=0.56..233.54 rows=57 width=33) (actual time=0.014..0.021 rows=7 loops=4)
Index Cond: (property_id = property.id)
Planning Time: 9.805 ms
Execution Time: 1.628 ms
With so many joins, I can see each one takes some time, and they add up. Is there anyway I can modify the query to speed things up? Would using a SSD drive help?
Upvotes: 0
Views: 52
Reputation: 425033
The problem is here:
LEFT JOIN property_sale_history ON property_sale_history.property_id = property.id
LEFT JOIN property_tax_history ON property_tax_history.property_id = property.id
There are independent joins to each table, and columns for both joins are in the result.
This creates an n2 problem with the number of rows returned. eg if there are 3 sale rows and 4 tax rows for a given property, there will be 12 rows for that property in the result. This is going to really hurt performance, and there’s no way to “fix” the performance problem, and such a query doesn’t make sense anyway.
You should use two separate queries - one for sale history and one for tax history.
You may consider creating a view for the common part of the two queries.
——
p.s. easing up on the new lines and left justifying everything after FROM will help readability.
Upvotes: 1