chaos
chaos

Reputation: 1671

Does `join` generate Cartesian product every time when using mysql?

I' learned that when we use join(left join or right join, not inner join) statement in mysql, mysql will generate a Cartesian Product as temporary table.

But somebody told me that if the columns which used in on statement have index, Cartesian Product won't be generated.

I'm not sure if he is right, because I can't find article or manual talks about it. It's he right?

By the way, somebody told me not to use join statement in production environment because it may have potential problem. But I think using join is quite no harm. And if we optimize the sql carefully, we will have no worry about performance problems. Should we forbid to use join in production environment? Thank you.

Upvotes: 1

Views: 1640

Answers (1)

Rick James
Rick James

Reputation: 142433

A Cartesian product, aka, CROSS JOIN, happens whenever you have a JOIN, including LEFT, RIGHT, and INNER, without either an ON clause or the equivalent in the WHERE clause.

A temp table may or may not be generated by any kind of JOIN. And it may or may not hit the disk.

JOINs in production. Sure. A well-indexed (etc) query is plenty fast. And, when you need a JOIN, the alternatives, if any, may be worse.

From a theoretical point of view, a JOIN is performed thus:

  1. Create the Cartesian product.
  2. Toss any rows that don't match the ON and WHERE restrictions.
  3. Move on to GROUP BY, HAVING, ORDER BY, and LIMIT.

In reality, the Optimizer takes all the short cuts it can think of. A typical JOIN goes more like this:

  1. Scan through one table, filtering out any rows not matching the WHERE.
  2. Reach into the joined table using an index to find the 0 or 1 or several rows there. Name: NLJ - Nested Loop Join
  3. Toss any more rows that don't match the rest of the ON and WHERE restrictions.
  4. etc.

As for LEFT and RIGHT -- Don't use them unless you need to get a result row even if the row is missing from the 'right' or 'left', respectively, table. It confuses the user and makes the Optimizer work harder to decide that you really meant INNER JOIN.

In MySQL, the keywords INNER and OUTER are essentially ignored. The existence of a suitable ON or WHERE controls what type of JOIN it is.

Upvotes: 3

Related Questions