Rainer Plumer
Rainer Plumer

Reputation: 3753

MySql Join condition with Where condition, execution order and performance

Im writing a large query, which must query data from many tables. ( 11 tables )

I'm rewriting a script, which has huge performance issues. The script makes thousands( sometimes tens of thousands ) of small queries instead of few large ones. The query can take hours to finish, and can consume massive amounts of memory. The way it works at the moment is this. 1) Query matching events from db 2) i.e 500 rows were returned. 3) loop over the rows, and fetch additional data from each associated table

   foreach(rows as row) {
      row.user = /* query db for user data */
      row.products = /* query db for product data */
      row.productMeta = /* query db for additional data */
      ... and so on, for 11 tables
   }

So, the 500 rows will create 500 x 11 queries.

My goal is to rewrite the script, in a way that it joins and evaluates all the conditions on MySql server side, in 1-2 large queries.

Here is the question. What is the most efficient way to make this large query. Does it make a difference, if i evaluate conditions in Join ON clause, compared to outer WHERE clause.

i.e

SELECT * FROM events
  INNER JOIN users ON users.event_id = events.id AND <condition 1>
  INNER JOIN products ON products.event_id = events.id AND <condition 2>
  LEFT JOIN inventory ON inventory.event_id = events.id AND <condition 3>

VS

SELECT * FROM events
  INNER JOIN users ON users.event_id = events.id
  INNER JOIN products ON products.event_id = events.id
  LEFT JOIN inventory ON inventory.event_id = events.id
WHERE 
  <condition 1> 
  AND <condition 2> 
  AND <condition 3>

VS 

making a few more queries to pre-filter the events, then join 

$userIds = "make separate join to return event ids where users.event_id = events.id AND <condition 1>"
$productIds = "make separate join to return event ids where products.event_id = events.id AND <condition 2>"

Then:

SELECT * FROM events
  INNER JOIN users ON users.event_id = events.id
  INNER JOIN products ON products.event_id = events.id
  LEFT JOIN inventory ON inventory.event_id = events.id
WHERE 
  events.id IN ($userIds) OR events.id IN ($productIds)

PS: All related columns are indexed

Upvotes: 1

Views: 1440

Answers (1)

Rick James
Rick James

Reputation: 142296

Yes, you are right to avoid 500*11 queries.

For JOIN, it does not matter whether you put in the ON or the WHERE. However it is "proper" to put it in WHERE.

For LEFT JOIN, it does matter. So, stick with the principle that ON says only how the tables are related, and WHERE filters.

IN ( list ) sometimes hobbles the Optimizer's ability to pick the best index. So, tentatively don't use that approach.

OR is almost always bad for performance. Avoid it whenever possible. One workaround is to use UNION.

If there is a GROUP BY you failed to mention, there could be some other issues.

Bottom line: Option 2 seems to be the best.

Regardless of what the final query looks like, you probably need to revisit the index(es) on the tables. See my cookbook .

You have given only a hand-waving of the problem, so I could give only a hand-waving of advice. Perhaps you would like to move a little closer to spelling out the Option 2 approach. And include SHOW CREATE TABLE.

Upvotes: 2

Related Questions