SQL Query is slow and not using indexes

Question

SELECT `productTitle`, `orderCnt`, `promPCPriceStr`,
  `productImgUrl`, `oriPriceStr`, `detailUrl`,
  (SELECT count(id) FROM orders t4 
   WHERE t4.productId = t1.productId 
     AND DATE( t4.`date`) > DATE_SUB(CURDATE(), INTERVAL 2 DAY)
  ) as ordertoday
FROM `products` t1
WHERE `orderCnt` > 0 
 AND `orderCnt` < 2000 
 AND `promPCPriceStr` > 0 
 AND `promPCPriceStr` < 2000 
HAVING ordertoday > 5 AND ordertoday < 2000 
order by ordertoday desc limit 150

This query take 18 second to finish when i run explain command on it shows this

a busy cat

it does not use the index keys !

The tables used

Products Table

CREATE TABLE `products` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `productId` bigint(20) NOT NULL,
 `detailUrl` text CHARACTER SET utf32 NOT NULL,
 `belongToDSStore` int(11) NOT NULL,
 `promPCPriceStr` float NOT NULL DEFAULT '-1',
 `oriPriceStr` float NOT NULL DEFAULT '-1',
 `orderCnt` int(11) NOT NULL,
 `productTitle` text CHARACTER SET utf32 NOT NULL,
 `productImgUrl` text CHARACTER SET utf32 NOT NULL,
 `created_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `cat` bigint(20) NOT NULL DEFAULT '-1',
 PRIMARY KEY (`id`),
 UNIQUE KEY `productId` (`productId`),
 KEY `orderCnt` (`orderCnt`),
 KEY `cat` (`cat`),
 KEY `promPCPriceStr` (`promPCPriceStr`)
) ENGINE=InnoDB AUTO_INCREMENT=37773 DEFAULT CHARSET=latin1

Orders Table

CREATE TABLE `orders` (
 `oid` int(11) NOT NULL AUTO_INCREMENT,
 `countryCode` varchar(10) NOT NULL,
 `date` datetime NOT NULL,
 `id` bigint(20) NOT NULL,
 `productId` bigint(20) NOT NULL,
 PRIMARY KEY (`oid`),
 UNIQUE KEY `id` (`id`),
 KEY `date` (`date`),
 KEY `productId` (`productId`)
) ENGINE=InnoDB AUTO_INCREMENT=9790205 DEFAULT CHARSET=latin1

Bill Karwin · Accepted Answer

MySQL won't use an index even if one exists on a column you search, if the values you search for appear on a large subset of the rows.

I did a test with MySQL 5.6. I created table with ~1,000,000 rows, with a column x with random values evenly distributed between 1 and 1000. There's an index on column x.

Depending on my search terms, I see the index is used if I search for a range of values matching a small enough subset of rows, otherwise it decides using the index is too much trouble, and just does a table-scan:

mysql> explain select * from foo where x < 50;
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows   | Extra                 |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
|  1 | SIMPLE      | foo   | range | x             | x    | 4       | NULL | 102356 | Using index condition |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+

mysql> explain select * from foo where x < 100;
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | foo   | ALL  | x             | NULL | NULL    | NULL | 1046904 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+

I would infer that your query's search conditions match a pretty large portion of the rows, and MySQL decides the indexes on these columns are not worth using.

WHERE `orderCnt` > 0 
 AND `orderCnt` < 2000 
 AND `promPCPriceStr` > 0 
 AND `promPCPriceStr` < 2000

If you think MySQL is making the wrong choice, you can try to use an index hint to tell MySQL that a table-scan is prohibitively expensive. This will urge it to use the index (if the index is relevant to the search condition).

mysql> explain select * from foo force index (x) where x < 100;
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows   | Extra                 |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
|  1 | SIMPLE      | foo   | range | x             | x    | 4       | NULL | 216764 | Using index condition |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+

I would write the query this way, without any subquery:

SELECT t.productTitle, t.orderCnt, t.promPCPriceStr,
  t.productImgUrl, t.oriPriceStr, t.detailUrl,
  COUNT(o.id) AS orderToday
FROM products t
LEFT JOIN orders o ON t.productid = o.productid AND o.date > CURDATE() - INTERVAL 2 DAY
WHERE t.orderCnt > 0 AND t.orderCnt < 2000
 AND t.promPCPriceStr > 0 AND t.promPCPriceStr < 2000
GROUP BY t.productid
HAVING ordertoday > 5 AND ordertoday < 2000
ORDER BY ordertoday DESC LIMIT 150

When I EXPLAIN the query, I get this report:

+----+-------------+-------+------+-----------------------------------+-----------+---------+------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys                     | key       | key_len | ref              | rows | Extra                                        |
+----+-------------+-------+------+-----------------------------------+-----------+---------+------------------+------+----------------------------------------------+
|  1 | SIMPLE      | t     | ALL  | productId,orderCnt,promPCPriceStr | NULL      | NULL    | NULL             | 9993 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | o     | ref  | date,productId                    | productId | 8       | test.t.productId |    1 | Using where                                  |
+----+-------------+-------+------+-----------------------------------+-----------+---------+------------------+------+----------------------------------------------+

It still does a table-scan for products but it joins the relevant matching rows in orders with an index lookup instead of a correlated subquery.

I filled my tables with random date, to make 98,846 product rows and 215,508 orders rows. When I run the query it takes about 0.18 seconds.

Although when I run your query with the correlated subquery, it takes 0.06 seconds. I don't know why your query is so slow. You could be running on an underpowered server.

I'm running my test on a Macbook Pro 2017 with an i7 CPU and 16GB of RAM.

SQL Query is slow and not using indexes

Answers (2)

Related Questions