Seafko
Seafko

Reputation: 79

SQL Query is slow and not using indexes

SELECT `productTitle`, `orderCnt`, `promPCPriceStr`,
  `productImgUrl`, `oriPriceStr`, `detailUrl`,
  (SELECT count(id) FROM orders t4 
   WHERE t4.productId = t1.productId 
     AND DATE( t4.`date`) > DATE_SUB(CURDATE(), INTERVAL 2 DAY)
  ) as ordertoday
FROM `products` t1
WHERE `orderCnt` > 0 
 AND `orderCnt` < 2000 
 AND `promPCPriceStr` > 0 
 AND `promPCPriceStr` < 2000 
HAVING ordertoday > 5 AND ordertoday < 2000 
order by ordertoday desc limit 150

This query take 18 second to finish when i run explain command on it shows this

a busy cat

it does not use the index keys !

The tables used

Products Table

CREATE TABLE `products` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `productId` bigint(20) NOT NULL,
 `detailUrl` text CHARACTER SET utf32 NOT NULL,
 `belongToDSStore` int(11) NOT NULL,
 `promPCPriceStr` float NOT NULL DEFAULT '-1',
 `oriPriceStr` float NOT NULL DEFAULT '-1',
 `orderCnt` int(11) NOT NULL,
 `productTitle` text CHARACTER SET utf32 NOT NULL,
 `productImgUrl` text CHARACTER SET utf32 NOT NULL,
 `created_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `cat` bigint(20) NOT NULL DEFAULT '-1',
 PRIMARY KEY (`id`),
 UNIQUE KEY `productId` (`productId`),
 KEY `orderCnt` (`orderCnt`),
 KEY `cat` (`cat`),
 KEY `promPCPriceStr` (`promPCPriceStr`)
) ENGINE=InnoDB AUTO_INCREMENT=37773 DEFAULT CHARSET=latin1

Orders Table

CREATE TABLE `orders` (
 `oid` int(11) NOT NULL AUTO_INCREMENT,
 `countryCode` varchar(10) NOT NULL,
 `date` datetime NOT NULL,
 `id` bigint(20) NOT NULL,
 `productId` bigint(20) NOT NULL,
 PRIMARY KEY (`oid`),
 UNIQUE KEY `id` (`id`),
 KEY `date` (`date`),
 KEY `productId` (`productId`)
) ENGINE=InnoDB AUTO_INCREMENT=9790205 DEFAULT CHARSET=latin1

Upvotes: 0

Views: 109

Answers (2)

Rick James
Rick James

Reputation: 142298

In both tables, it is counterproductive to have both an AUTO_INCREMENT PRIMARY KEY and a BIGINT column that is UNIQUE. Get rid of the AI column and promote the other to PK. This may require changing some of your code, since the AI column is gone.

As for the subquery...

  (SELECT count(id) FROM orders t4 
   WHERE t4.productId = t1.productId 
     AND DATE( t4.`date`) > DATE_SUB(CURDATE(), INTERVAL 2 DAY)
  ) as ordertoday

Change COUNT(id) to COUNT(*) unless you need to check id for being NOT NULL (which I doubt).

The date column is hidden in a function call, so no index will be useful. So, change the date test to

AND t4.`date` > CURDATE - INTERVAL 2 DAY

Then add this composite index. (It will help with Karwin's reformulation, too).

INDEX(productId, date)

Upvotes: 1

Bill Karwin
Bill Karwin

Reputation: 562330

MySQL won't use an index even if one exists on a column you search, if the values you search for appear on a large subset of the rows.

I did a test with MySQL 5.6. I created table with ~1,000,000 rows, with a column x with random values evenly distributed between 1 and 1000. There's an index on column x.

Depending on my search terms, I see the index is used if I search for a range of values matching a small enough subset of rows, otherwise it decides using the index is too much trouble, and just does a table-scan:

mysql> explain select * from foo where x < 50;
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows   | Extra                 |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
|  1 | SIMPLE      | foo   | range | x             | x    | 4       | NULL | 102356 | Using index condition |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+

mysql> explain select * from foo where x < 100;
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | foo   | ALL  | x             | NULL | NULL    | NULL | 1046904 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+

I would infer that your query's search conditions match a pretty large portion of the rows, and MySQL decides the indexes on these columns are not worth using.

WHERE `orderCnt` > 0 
 AND `orderCnt` < 2000 
 AND `promPCPriceStr` > 0 
 AND `promPCPriceStr` < 2000 

If you think MySQL is making the wrong choice, you can try to use an index hint to tell MySQL that a table-scan is prohibitively expensive. This will urge it to use the index (if the index is relevant to the search condition).

mysql> explain select * from foo force index (x) where x < 100;
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows   | Extra                 |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+
|  1 | SIMPLE      | foo   | range | x             | x    | 4       | NULL | 216764 | Using index condition |
+----+-------------+-------+-------+---------------+------+---------+------+--------+-----------------------+

I would write the query this way, without any subquery:

SELECT t.productTitle, t.orderCnt, t.promPCPriceStr,
  t.productImgUrl, t.oriPriceStr, t.detailUrl,
  COUNT(o.id) AS orderToday
FROM products t
LEFT JOIN orders o ON t.productid = o.productid AND o.date > CURDATE() - INTERVAL 2 DAY
WHERE t.orderCnt > 0 AND t.orderCnt < 2000
 AND t.promPCPriceStr > 0 AND t.promPCPriceStr < 2000
GROUP BY t.productid
HAVING ordertoday > 5 AND ordertoday < 2000
ORDER BY ordertoday DESC LIMIT 150

When I EXPLAIN the query, I get this report:

+----+-------------+-------+------+-----------------------------------+-----------+---------+------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys                     | key       | key_len | ref              | rows | Extra                                        |
+----+-------------+-------+------+-----------------------------------+-----------+---------+------------------+------+----------------------------------------------+
|  1 | SIMPLE      | t     | ALL  | productId,orderCnt,promPCPriceStr | NULL      | NULL    | NULL             | 9993 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | o     | ref  | date,productId                    | productId | 8       | test.t.productId |    1 | Using where                                  |
+----+-------------+-------+------+-----------------------------------+-----------+---------+------------------+------+----------------------------------------------+

It still does a table-scan for products but it joins the relevant matching rows in orders with an index lookup instead of a correlated subquery.

I filled my tables with random date, to make 98,846 product rows and 215,508 orders rows. When I run the query it takes about 0.18 seconds.

Although when I run your query with the correlated subquery, it takes 0.06 seconds. I don't know why your query is so slow. You could be running on an underpowered server.

I'm running my test on a Macbook Pro 2017 with an i7 CPU and 16GB of RAM.

Upvotes: 3

Related Questions