WonderLand
WonderLand

Reputation: 5674

SQL Join condition optimization

I had an issue with a SQL query that was slow and lead me to a 500 Internal Server Error.

After playing a bit with the query I had found out that moving join conditions outside makes the query work and the error vanish.

Now I wonder:

Here the queries.

Original (slow) SQL:

SELECT *
FROM `TABLE_1` AS `main_table`
  INNER JOIN `TABLE_2` AS `w`
    ON main_table.entity_id = w.product_id
  LEFT JOIN `TABLE_3` AS `ur`
    ON main_table.entity_id = ur.product_id 
       AND ur.category_id IS NULL 
       AND ur.store_id = 1 
       AND ur.is_system = 1
WHERE (w.website_id = '1')

Faster SQL:

SELECT *
FROM `TABLE_1` AS `main_table`
  INNER JOIN `TABLE_2` AS `w`
    ON main_table.entity_id = w.product_id
  LEFT JOIN `TABLE_3` AS `ur`
    ON main_table.entity_id = ur.product_id 
WHERE (w.website_id = '1') 
AND ur.category_id IS NULL 
AND ur.store_id = 1 
AND ur.is_system = 1

Upvotes: 2

Views: 360

Answers (1)

Dan
Dan

Reputation: 10680

Is the 2nd query equivalent to the 1st?

Depending on the data in TABLE_3, the 2nd query is NOT equivalent to the first, since you're using a LEFT JOIN.

Consider what happens when you have a record in TABLE_1, with an entity_id that does not match any rows in TABLE_3: Your first query would still return this record from TABLE_1, with NULL values for all the columns of TABLE_3. The 2nd query applies filters to the columns from TABLE_3, but since these are all NULLs, the record now gets filtered out.

In general, query 2 would thus return fewer records than query 1 (unless, of course, all your entity_id's match a product_id from TABLE_3).

What was my bug mistake, why it was so slow?

Both queries are valid SQL, so none of them should give you a 500 Internal Server Error. This does not even look like a MySQL error, so maybe the bug arose elsewhere? Perhaps you have a web application that is unable to handle the NULL's returned from the 1st query? It's impossible to answer the question without more details about the error.

As for the speed of the queries, this depends a lot on what indexes are defined. Generally, I would not expect the 2nd query to be faster than the 1st, but that depends. As I mentioned above, the 1st query might return more records than the 2nd. If you're not querying the database directly, but instead through some application, could it be the application slowing down your query?

Upvotes: 1

Related Questions