Ben
Ben

Reputation: 261

Why is mySQL query, left join 'considerably' faster than my inner join

I've researched this, but I still cannot explain why:

SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
INNER JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = 23155

Is significantly slower than:

SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
LEFT JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = 23155

115ms Vs 478ms. They are both using InnoDB and there are relationships defined. The 'card_legality' contains approx 200k rows, while the 'legality' table contains 11 rows. Here is the structure for each:

CREATE TABLE `card_legality` (
  `card_id` varchar(8) NOT NULL DEFAULT '',
  `legality_id` int(3) NOT NULL,
  `cl_boolean` tinyint(1) NOT NULL,
  PRIMARY KEY (`card_id`,`legality_id`),
  KEY `legality_id` (`legality_id`),
  CONSTRAINT `card_legality_ibfk_2` FOREIGN KEY (`legality_id`) REFERENCES `legality` (`legality_id`),
  CONSTRAINT `card_legality_ibfk_1` FOREIGN KEY (`card_id`) REFERENCES `card` (`card_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

And:

CREATE TABLE `legality` (
  `legality_id` int(3) NOT NULL AUTO_INCREMENT,
  `l_name` varchar(16) NOT NULL DEFAULT '',
  PRIMARY KEY (`legality_id`)
) ENGINE=InnoDB AUTO_INCREMENT=12 DEFAULT CHARSET=latin1;

I could simply use LEFT-JOIN, but it doesn't seem quite right... any thoughts, please?

UPDATE: As requested, I've included the results of explain for each. I had run it previously, but I dont pretend to have a thorough understanding of it..

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  cl  ALL PRIMARY NULL    NULL    NULL    199747  Using where
1   SIMPLE  l   eq_ref  PRIMARY PRIMARY 4   hexproof.co.uk.cl.legality_id   1   

AND, inner join:

id  select_type table   type    possible_keys   key key_len         ref                         rows    Extra
1   SIMPLE  l   ALL PRIMARY NULL    NULL    NULL    11  
1   SIMPLE  cl  ref PRIMARY,legality_id legality_id 4   hexproof.co.uk.l.legality_id    33799   Using where

Upvotes: 12

Views: 11607

Answers (4)

L2G
L2G

Reputation: 846

I'd try EXPLAIN on both of those queries. Just prefix each SELECT with EXPLAIN and run them. It gives really useful info on how mySQL is optimizing and executing queries.

Upvotes: 3

Andreas Wederbrand
Andreas Wederbrand

Reputation: 40061

It is because of the varchar on card_id. MySQL can't use the index on card_id as card_id as described here mysql type conversion. The important part is

For comparisons of a string column with a number, MySQL cannot use an index on the column to look up the value quickly. If str_col is an indexed string column, the index cannot be used when performing the lookup in the following statement:

SELECT * FROM tbl_name WHERE str_col=1;

The reason for this is that there are many different strings that may convert to the value 1, such as '1', ' 1', or '1a'.

If you change your queries to

SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
INNER JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = '23155'

and

SELECT cl.`cl_boolean`, l.`l_name`
FROM `card_legality` cl
LEFT JOIN `legality` l ON l.`legality_id` = cl.`legality_id`
WHERE cl.`card_id` = '23155'

You should see a huge improvement in speed and also see a different EXPLAIN.

Here is a similar (but easier) test to show this:

> desc id_test;
+-------+------------+------+-----+---------+-------+
| Field | Type       | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| id    | varchar(8) | NO   | PRI | NULL    |       |
+-------+------------+------+-----+---------+-------+
1 row in set (0.17 sec)

> select * from id_test;
+----+
| id |
+----+
| 1  |
| 2  |
| 3  |
| 4  |
| 5  |
| 6  |
| 7  |
| 8  |
| 9  |
+----+
9 rows in set (0.00 sec)

> explain select * from id_test where id = 1;
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table   | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
|  1 | SIMPLE      | id_test | index | PRIMARY       | PRIMARY | 10      | NULL |    9 | Using where; Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)


> explain select * from id_test where id = '1';
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table   | type  | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | id_test | const | PRIMARY       | PRIMARY | 10      | const |    1 | Using index |
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------------+
1 row in set (0.00 sec)

In the first case there is Using where; Using index and the second is Using index. Also ref is either NULL or CONST. Needless to say, the second one is better.

Upvotes: 10

Stephane Gosselin
Stephane Gosselin

Reputation: 9148

L2G has it pretty much summed up, although I suspect it could be because of the varchar type used for card_id.

I actually printed out this informative page for benchmarking / profiling quickies. Here is a quick poor-mans profiling technique:

Time a SQL on MySQL
Enable Profiling
mysql> SET PROFILING = 1
...
RUN your SQLs
...
mysql> SHOW PROFILES;

+----------+------------+-----------------------+
| Query_ID | Duration   | Query                 |
+----------+------------+-----------------------+
|        1 | 0.00014600 | SELECT DATABASE()     |
|        2 | 0.00024250 | select user from user |
+----------+------------+-----------------------+
mysql> SHOW PROFILE for QUERY 2;

+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000034 |
| checking query cache for query | 0.000033 |
| checking permissions           | 0.000006 |
| Opening tables                 | 0.000011 |
| init                           | 0.000013 |
| optimizing                     | 0.000004 |
| executing                      | 0.000011 |
| end                            | 0.000004 |
| query end                      | 0.000002 |
| freeing items                  | 0.000026 |
| logging slow query             | 0.000002 |
| cleaning up                    | 0.000003 |
+--------------------------------+----------+

Good-luck, oh and please post your findings!

Upvotes: 4

K. Bob
K. Bob

Reputation: 2688

I'm pretty sure that MySql has better optimization for Left Joins - no evidence to back this up at the moment.

ETA : A quick scout round and I can't find anything concrete to uphold my view so.....

Upvotes: 0

Related Questions