Ryan
Ryan

Reputation: 15250

Why would a MySQL SELECT query take 1-2 minutes to run if EXPLAIN only shows 400 rows?

I need to fetch the most recent 1000 records from a large table (70M rows) matching a couple well-indexed items through INNER JOINs on two simple and small tables.

The query is taking 1-2 minutes to run. Yet the explain only shows several hundred rows to look through. What gives?

How can I optimize the query or more efficiently index the tables to make this query run in the milliseconds I expect?

The Tables:

score    70,000,000 records
class           400 records
category        400 records

The query:

SELECT
    s.log_id,
    s.category_id
FROM
    score s
    INNER JOIN category ca ON s.category_id = ca.id
    INNER JOIN class cl ON ca.class_id = cl.id
WHERE
        s.score_status_type_id = 0
    AND ca.category_status_id = 1
    AND cl.class_status_id IN (1, 2)
    AND s.date > DATE_ADD(NOW(), INTERVAL -1440 minute)
GROUP BY s.log_id
ORDER BY s.date DESC
LIMIT 1000:

Here's the explain:

*** row 1 ***
          table:  cl
           type:  range
  possible_keys:  PRIMARY,class_status_id
            key:  class_status_id
        key_len:  4
            ref:  NULL
           rows:  36
          Extra:  Using where; Using index; Using temporary; Using filesort
*** row 2 ***
          table:  ca
           type:  ref
  possible_keys:  PRIMARY,class_id,category_status_id,category_status_id_class_id_id
            key:  category_status_id_class_id_id
        key_len:  8
            ref:  const,my_db.cl.id
           rows:  1
          Extra:  Using index
*** row 3 ***
          table:  s
           type:  ref
  possible_keys:  unique_key,category_id,date,score,score_status_type_id,score_status_and_date,category_id_score_status_type_id_date_log_id,date_reverse,category_id_date_reverse,score_date
            key:  category_id_score_status_type_id_date_log_id
        key_len:  8
            ref:  my_db.ca.id,const
           rows:  396
          Extra:  Using where; Using index

Here are some of the create tables:

CREATE TABLE `score` (
  `log_id` bigint(20) NOT NULL,
  `profile_id` bigint(20) DEFAULT NULL,
  `date` datetime DEFAULT NULL,
  `class_id` int(11) NOT NULL,
  `score` float(10,6) DEFAULT NULL,
  `score_date` datetime DEFAULT NULL,
  `process_date` datetime DEFAULT NULL,
  `status_type_id` int(3) NOT NULL DEFAULT '0',
  `date_reverse` int(11) DEFAULT NULL,
  UNIQUE KEY `unique_key` (`log_id`,`class_id`),
  KEY `class_id` (`class_id`),
  KEY `profile_id` (`profile_id`),
  KEY `date` (`date`),
  KEY `score` (`score`),
  KEY `status_type_id` (`status_type_id `),
  KEY `status_type_id_date` (`status_type_id`,`date`),
  KEY `class_status_type_id_date_log_id` (`class_id`,`status_type_id`,`date`,`log_id`),
  KEY `date_reverse` (`date_reverse`),
  KEY `class_id_date_reverse` (`class_id`,`date_reverse`),
  KEY `date` (`date`),
  KEY `class_id_date_reverse_log_id` (`class_id`,`date_reverse`,`log_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE `category` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `class_id` int(11) NOT NULL,
  `category_status_id` int(11) NOT NULL DEFAULT '0',
  `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `class_id` (`class_id`),
  KEY `name` (`name`),
  KEY `category_status_id_class_id_id` (`category_status_id`,`class_id`,`id`)
) ENGINE=InnoDB AUTO_INCREMENT=412 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `class` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `class_status_id` int(11) NOT NULL DEFAULT '1',
  `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `person_id` (`person_id`),
  KEY `name` (`name`),
  KEY `class_status_id` (`class_status_id`),
  KEY `class_multi_1` (`class_status_id`,`name`,`id`)
) ENGINE=InnoDB AUTO_INCREMENT=407 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

Upvotes: 1

Views: 538

Answers (1)

Bohemian
Bohemian

Reputation: 424973

The problem is that the where clause is a filter that is applied after the joins are made, so your joined-table conditions being in the where clause require that the join actually be made and put into the temporary result set (which could be huge). Usually the optimizer recognises that the condition can be asserted at join time, but sometimes it can be a bit dense, so...

Try moving the non-key conditions into the joins

SELECT s.log_id, s.category_id
FROM score s
JOIN category ca ON s.category_id = ca.id
    AND ca.category_status_id = 1
JOIN class cl ON ca.class_id = cl.id
    AND cl.class_status_id IN (1, 2)
WHERE s.score_status_type_id = 0
AND s.date > DATE_ADD(NOW(), INTERVAL -1440 minute)
GROUP BY s.log_id
ORDER BY s.date DESC
LIMIT 1000

If that doesn't help enough, try first grabbing the subset of score rows as a subquery, then doing your joins to that.

Upvotes: 1

Related Questions