aqua
aqua

Reputation: 647

Performance of multi-table select statements

I'm extremely new to SQL and I'm trying to figure out the "best way" to perform selections when multiple tables are involved. Unfortunately, my dataset is pretty small (thereby making this question fairly academic), so everything returns in the same amount of time regardless.

I have the following tables (but with less abstract data names):

mysql> describe tblA
+----------------+-------------+------+-----+---------+----------------+
| Field          | Type        | Null | Key | Default | Extra          |
+----------------+-------------+------+-----+---------+----------------+
| ID             | bigint(20)  | NO   | PRI | NULL    | auto_increment |
| A_data         | varchar(20) | YES  |     | null    |                |
| B_ID           | bigint(20)  | NO   | MUL | NULL    |                |
+----------------+-------------+------+-----+---------+----------------+
mysql> describe tblB
+----------------+-------------+------+-----+---------+----------------+
| Field          | Type        | Null | Key | Default | Extra          |
+----------------+-------------+------+-----+---------+----------------+
| ID             | bigint(20)  | NO   | PRI | NULL    | auto_increment |
| B_data         | varchar(20) | YES  |     | null    |                |
| C_ID           | bigint(20)  | NO   | MUL | NULL    |                |
+----------------+-------------+------+-----+---------+----------------+
mysql> describe tblC
+----------------+-------------+------+-----+---------+----------------+
| Field          | Type        | Null | Key | Default | Extra          |
+----------------+-------------+------+-----+---------+----------------+
| ID             | bigint(20)  | NO   | PRI | NULL    | auto_increment |
| C_data         | varchar(20) | YES  |     | null    |                |
+----------------+-------------+------+-----+---------+----------------+

There seem to be three ways that I can arrange a query to get the A_data given a particular value for C_data:

SELECT A_data FROM tblA,tblB,tblC
  WHERE (tblC.C_data=searchKey AND tblB.C_ID=tblC.ID AND tblA.B_ID=tblB.ID)

or

SELECT A_data FROM tblA
  INNER JOIN tblB ON tblA.tblB_ID=tblB.ID
  INNER JOIN tblC ON tblB.tblC_ID=tblC.ID
  WHERE tblC.C_data=searchKey

or

SELECT A_data FROM tblA
  WHERE tblB_ID IN (
    SELECT ID FROM tblB WHERE tblC_ID IN (
      SELECT ID FROM tblC WHERE tblC_data=searchKey
    )
  )

I'm curious about the relative ramifications and merits (if any) of these techniques. For instance, joining three tables just to do a search seems to defeat the point of keeping them in separate tables, but conversely I'm unsure of the wisdom of nested selects. I'm also not sure exactly what's happening in the background with the first case.

Upvotes: 0

Views: 47

Answers (2)

sjlxndr
sjlxndr

Reputation: 46

Caveat: not my area of expertise, and this is not a direct answer to your question. However, it might be useful to you to check out the MySQL optimizer tracing option, as it should shed some light on what the query optimizer does with your SQL and how it goes about actually executing it. Relevant links:

http://dev.mysql.com/doc/internals/en/optimizer-tracing.html - (official oracle documentation) http://jorgenloland.blogspot.com/2011/10/optimizer-tracing-query-execution-plan.html - (notes on usage and output and what it means, from an optimizer dev)

Upvotes: 0

Code Different
Code Different

Reputation: 93141

Use the second form. It is the most common and most readable of the three.

  • It's likely that MySQL can internally make the first form the same as second. Check the execution plan. Any how, it's not easy to read as the relationships between the 3 tables are not immediately obvious.
  • The third form is much harder to optimize, especially with more tables involved. Avoid it.

Upvotes: 1

Related Questions