Kenny Cason

Reputation: 12328

MySQL optimization - large table joins

To start out, here is a simplified version of the tables involved.

tbl_map has approx. 4,000,000 rows, tbl_1 has approx. 120 rows, and tbl_2 has approx. 5,000,000 rows. I know this data shouldn't be considered that large, given that Google, Yahoo!, etc. use much larger datasets, so I'm assuming that I'm missing something.

    CREATE TABLE `tbl_map` (
      `id` bigint(20) NOT NULL AUTO_INCREMENT,
      `tbl_1_id` bigint(20) DEFAULT '-1',
      `tbl_2_id` bigint(20) DEFAULT '-1',
      `rating` decimal(3,3) DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `tbl_1_id` (`tbl_1_id`),
      KEY `tbl_2_id` (`tbl_2_id`)
    ) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

    CREATE TABLE `tbl_1` (
      `id` bigint(20) NOT NULL AUTO_INCREMENT,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

    CREATE TABLE `tbl_2` (
      `id` bigint(20) NOT NULL AUTO_INCREMENT,
      `data` varchar(255) NOT NULL DEFAULT '',
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB  DEFAULT CHARSET=utf8;

The query of interest is below (shown with ORDER BY t.id DESC instead of ORDER BY RAND()). It takes as much as 5~10 seconds and causes a considerable wait when users view this page.

    EXPLAIN SELECT t.data, t.id, tm.rating
    FROM tbl_2 AS t
    JOIN tbl_map AS tm
    ON t.id = tm.tbl_2_id
    WHERE tm.tbl_1_id = 94
    AND tm.rating IS NOT NULL
    ORDER BY t.id DESC
    LIMIT 200

    id  select_type  table  type    possible_keys      key       key_len  ref          rows    Extra
    1   SIMPLE       tm     ref     tbl_1_id,tbl_2_id  tbl_1_id  9        const        703438  Using where; Using temporary; Using filesort
    1   SIMPLE       t      eq_ref  PRIMARY            PRIMARY   8        tm.tbl_2_id  1

I would just like to speed up the query, ensure that I have proper indexes, etc. I appreciate any advice from DB gurus out there! Thanks.

Upvotes: 4

Views: 2564

Answers (2)

RolandoMySQLDBA

Reputation: 44343

SUGGESTION : Index the table as follows:

ALTER TABLE tbl_map ADD INDEX (tbl_1_id,rating,tbl_2_id);
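A rough sketch of how this could be applied and checked. The index name `idx_tbl1_rating_tbl2` is a placeholder of my own, not something from the question; the query is the one from the question, unchanged.

```sql
-- Add the composite index under an explicit (hypothetical) name
-- so it is easy to recognise in the EXPLAIN output.
ALTER TABLE tbl_map
  ADD INDEX idx_tbl1_rating_tbl2 (tbl_1_id, rating, tbl_2_id);

-- Re-run EXPLAIN on the original query and compare it with the earlier plan.
EXPLAIN
SELECT t.data, t.id, tm.rating
FROM tbl_2 AS t
JOIN tbl_map AS tm ON t.id = tm.tbl_2_id
WHERE tm.tbl_1_id = 94
  AND tm.rating IS NOT NULL
ORDER BY t.id DESC
LIMIT 200;
```

If the optimizer picks the new index, the `key` column for `tm` should show its name, and `Using index` in the Extra column would indicate that the map table is being read from the index alone.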

Upvotes: 2

DRapp

Reputation: 48139

As per Rolando, yes, you definitely need an index on the map table, and it should ALSO include tbl_2_id, which is what your ORDER BY clause sorts on. Table 2's ID is stored in the map table as well, so order by the map table's column and let the index handle it. Since the index then holds all 3 fields, leading with the ID used for the key lookup and then the rating used for the NULL (or not) test, the 3rd element should already have the rows in order for your ORDER BY clause.

INDEX (tbl_1_id,rating, tbl_2_id);

Then, I would just have the query as

SELECT STRAIGHT_JOIN 
      t.data, 
      t.id , 
      tm.rating
   FROM 
      tbl_map tm
         join tbl_2 t
            on tm.tbl_2_id = t.id
   WHERE 
          tm.tbl_1_id = 94
      AND tm.rating IS NOT NULL
   ORDER BY 
      tm.tbl_2_id DESC
   LIMIT 200 
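Whether the filesort actually goes away is worth confirming with EXPLAIN: `rating IS NOT NULL` is a range condition on the second index column, and MySQL generally cannot use the columns after a range column to satisfy an ORDER BY. A variation to test, offered as an assumption rather than something verified against this data, is to put the sort column directly after the equality column. The index name `idx_tbl1_tbl2_rating` is hypothetical.

```sql
-- Hypothetical alternative index: equality column first, then the sort column,
-- with rating last so the IS NOT NULL test can still be answered from the index.
ALTER TABLE tbl_map
  ADD INDEX idx_tbl1_tbl2_rating (tbl_1_id, tbl_2_id, rating);

-- Same query as above; check whether "Using filesort" is gone for tm.
EXPLAIN
SELECT STRAIGHT_JOIN t.data, t.id, tm.rating
FROM tbl_map tm
JOIN tbl_2 t ON tm.tbl_2_id = t.id
WHERE tm.tbl_1_id = 94
  AND tm.rating IS NOT NULL
ORDER BY tm.tbl_2_id DESC
LIMIT 200;
```

With this column order, the rows for `tbl_1_id = 94` are stored sorted by `tbl_2_id`, so the server may be able to walk the index backwards and stop once 200 rows with a non-NULL rating have been found. Compare both plans and timings before keeping either index.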

Upvotes: 2
