Daniel Koch
Daniel Koch

Reputation: 600

MySQL query optimization: how can I speed up this query?

This is my table:

  CREATE TABLE `tab_adasf` (
  `adasf_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `adasf_shopId` int(10) unsigned NOT NULL,
  `adasf_localId` bigint(20) unsigned NOT NULL,
  `adasf_shopState` varchar(255) DEFAULT NULL,
  `adasf_shopCity` varchar(255) DEFAULT NULL,
  `adasf_shopName` varchar(255) DEFAULT NULL,
  `adasf_shopDoor` varchar(255) DEFAULT NULL,
  `adasf_computerName` varchar(255) DEFAULT NULL,
  `adasf_channel` bigint(20) NOT NULL,
  `adasf_totalInside` bigint(20) NOT NULL,
  `adasf_totalOutside` bigint(20) NOT NULL,
  `adasf_createdAt` datetime NOT NULL,
  PRIMARY KEY (`adasf_id`),
  KEY `adasf_shopId` (`adasf_shopId`),
  KEY `adasf_localId` (`adasf_localId`),
  KEY `adasf_shopState` (`adasf_shopState`,`adasf_shopCity`,`adasf_shopName`,`adasf_shopDoor`),
  KEY `adasf_computerName` (`adasf_computerName`,`adasf_channel`,`adasf_createdAt`),
  CONSTRAINT `tab_adasf_ibfk_1` FOREIGN KEY (`adasf_shopId`) REFERENCES `tab_shop` (`shop_id`) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1453500 DEFAULT CHARSET=utf8

As AUTO_INCREMENT's value says: it has 1453500 rows.

In order to generate a XML file, I need the resultset as follows:

SELECT
    UPPER(adasf_shopState) AS adasf_shopState,
    UPPER(adasf_shopCity) AS adasf_shopCity,
    UPPER(adasf_shopName) AS adasf_shopName,
    UPPER(adasf_shopDoor) AS adasf_shopDoor,
    adasf_computerName,
    adasf_channel,
    SUM(adasf_totalInside) AS adasf_totalInside,
    SUM(adasf_totalOutside) AS adasf_totalOutside,
    YEAR(adasf_createdAt) AS year,
    MONTH(adasf_createdAt) AS month,
    DAY(adasf_createdAt) AS day,
    HOUR(adasf_createdAt) AS hour
    FROM tab_adasf 
WHERE 1=1 AND adasf_shopId = '1' AND HOUR(adasf_createdAt) BETWEEN '10:00' AND '21:00' 
GROUP BY
    UPPER(adasf_shopState),
    UPPER(adasf_shopCity),
    UPPER(adasf_shopName),
    UPPER(adasf_shopDoor),
    adasf_computerName,
    adasf_channel,
    YEAR(adasf_createdAt),
    MONTH(adasf_createdAt),
    DAY(adasf_createdAt),
    HOUR(adasf_createdAt)
ORDER BY
    UPPER(adasf_shopState),
    UPPER(adasf_shopCity),
    UPPER(adasf_shopName),
    UPPER(adasf_shopDoor),
    UPPER(adasf_computerName),
    adasf_channel,
    adasf_createdAt

It takes 3 minutes to run and fetch.

My question is: what am I doing wrong? How can I speed up this query or table?

Thanks in advance!

Upvotes: 0

Views: 64

Answers (3)

O. Jones
O. Jones

Reputation: 108641

As others have said, a query like this that reports on the whole table is inherently time-consuming. That being said, here's a suggestion or two.

First, eliminate the UPPER() function calls in the GROUP BY clause. MySQL's collations are case-insensitive anyhow.

Second, instead of GROUP BY year, month, day, hour, try using this expression in the GROUP BY.

 DATE_FORMAT(adasf_createdAt, '%Y-%m-%d %H:00:00')

This will basically round off your createdAt value to the immediately preceding hour.

Third, let's refactor the WHERE item that reads

 HOUR(adasf_createdAt) BETWEEN '10:00' AND '21:00'

It should say

 HOUR(adasf_createdAt) BETWEEN 10 AND 21

And if you remove that from your main query it will speed up. You can then wrap your query in another query like so:

SELECT *
  FROM ( /*your whole query without the WHERE HOUR() BETWEEN clause */
       ) AS q
  WHERE q.hour BETWEEN 10 AND 21

Finally, try creating a compound covering index on

 adasf_shopId, adasf_shopState, adasf_shopCity, adasf_shopName, adasf_shopDoor,
 adasf_computerName, adasf_channel,  adasf_CreatedAt, 
 adasf_totalInside, adasf_totalOutside

This index has all the information required to satisfy your query arranged in sequential order. It's possible this will speed up your query.

So, your ultimate query looks like this:

SELECT *
  FROM (
SELECT
       UPPER(adasf_shopState) AS adasf_shopState,
       UPPER(adasf_shopCity) AS adasf_shopCity,
       UPPER(adasf_shopName) AS adasf_shopName,
       UPPER(adasf_shopDoor) AS adasf_shopDoor,
       adasf_computerName,
       adasf_channel,
       SUM(adasf_totalInside) AS adasf_totalInside,
       SUM(adasf_totalOutside) AS adasf_totalOutside,
       YEAR(adasf_createdAt) AS year,
       MONTH(adasf_createdAt) AS month,
       DAY(adasf_createdAt) AS day,
       HOUR(adasf_createdAt) AS hour
 FROM tab_adasf 
WHERE 1=1 
  AND adasf_shopId = '1'  
GROUP BY
      adasf_shopState,
      adasf_shopCity,
      adasf_shopName,
      adasf_shopDoor,
      adasf_computerName,
      adasf_channel, 
      DATE_FORMAT(adasf_createdAt, '%Y-%m-%d %H:00:00')
ORDER BY
      adasf_shopState,
      adasf_shopCity,
      adasf_shopName,
      adasf_shopDoor,
      adasf_computerName,
      adasf_channel, 
      DATE_FORMAT(adasf_createdAt, '%Y-%m-%d %H:00:00')
     ) AS q
  WHERE q.hour BETWEEN 10 AND 21

It's possible this simplification of your query, combined with the covering index, will make the query faster.

Please note that I haven't debugged this query and don't have the test data to do so.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269443

To speed up the query, you can create an index on tab_adasf(adasf_shopId). This should help performance a lot if you have many shops.

If you need to do a lot of queries of this type, then consider splitting the adasf_createdAt column into a date component and a time component. Then you can create an index on tab_adasf(adasf_shopId, adasf_createdAt_time), further helping the query.

In general splitting the time from the datetime is not recommended unless you have a good reason. Increasing performance of this type of query constitutes a "good reason".

Upvotes: 2

Andy Lester
Andy Lester

Reputation: 93636

Edit: This answer will not work in MySQL.

You're having to use a full table scan to check every row to see if it matches HOUR(adasf_createdAt) BETWEEN '10:00' AND '21:00'.

Create a functional index on HOUR(adasf_createdAt).

Also, see http://use-the-index-luke.com for an excellent tutorial on how to use indexes effectively.

Upvotes: -1

Related Questions