Jaylen
Jaylen

Reputation: 40289

ORDER BY datetime makes the query very slow

I am trying to pull data from multiple tables and when I user ORDER BY a datetime field it return the results after at least 10 seconds but if I do the same query without ORDER BY then it return the results for under 2 seconds.

This is my current query

SELECT
ph.call_subject AS callSubject,
ac.account_name AS accountName,
DATE_FORMAT(ph.trigger_on, "%c/%e/%Y %h:%i %p") AS triggerOn,
ind.name AS industry,
cc.call_code_name AS callCode
FROM phone_calls AS ph
INNER JOIN accounts AS ac ON ph.account_id = ac.account_id
INNER JOIN industries AS ind ON ind.industry_id = ac.industry_id
INNER JOIN call_codes AS cc ON ph.call_code_id = cc.call_code_id
WHERE ac.status = 1 AND ph.status = 1 AND ph.owner_id = 1 AND ac.do_not_call = 0
AND ph.trigger_on BETWEEN '2012-11-19 00:00:00' AND '2013-03-19 23:59:59'
ORDER BY ph.trigger_on ASC LIMIT 0,1000

the following fields are all of the type INT(11) UNSIGNED

ph.account_id
ac.account_id
ind.industry_id
ac.industry_id
ph.call_code_id
cc.call_code_id
ph.owner_id

The following fields are all of the type tinyint(1)

ac.status 
ph.status
ac.do_not_call

this field is a datetime type

ph.trigger_on

Please note that has accounts 300K records and phone_calls has 5 million records. What can I do to preform this ORDER BY faster? note that all of my where clause fields, all my ON clause and ph.trigger_on are indexed. and I am using InnoDB storage engine not MyIsam.

Thanks

Upvotes: 5

Views: 14264

Answers (5)

Captain Payalytic
Captain Payalytic

Reputation: 1071

Please try this:

  1. Build an index on the columns (phone_calls.trigger_on, phone_calls.status, phone_calls.owner_id) Call it pcto

  2. Change your FROM clause to:

    FROM phone_calls AS ph FORCE INDEX (pcto)

This is the ideal. If it does not work, then add a comment and I will give you another method that it guaranteed to work and give you the performance improvement that you need.

PLEASE NOTE: It doesn't matter (and indeed does no good) to have indexes built on "every" column in your query. MySQL can only use ONE index per table (or more correctly per table alias). You need to build the indexes that we are telling you to.

Upvotes: 7

Nicolas Giszpenc
Nicolas Giszpenc

Reputation: 703

When you do a SELECT on (SELECT) aka, it's really like working on a temp table. The example below has a few joins on one main large table. This solution took the query down to 0.2 seconds vs 20 seconds when the ORDER BY is on the whole table query.

   SELECT * FROM (SELECT `cse_notes`.`notes_id`, `cse_notes`.`dateandtime`, 
    `cse_case`.`case_id`, `cse_case_notes`.`attribute`
    FROM  `cse_notes` 
    INNER JOIN  `cse_case_notes` 
    ON `cse_notes`.`notes_uuid` =  `cse_case_notes`.`notes_uuid`
    INNER JOIN `cse_case` 
    ON  `cse_case_notes`.`case_uuid` = `cse_case`.`case_uuid`
    WHERE `cse_notes`.`deleted` = 'N' AND `cse_case`.`case_id` = :case_id
    AND `cse_notes`.customer_id = :customer_id) notes
    ORDER BY `dateandtime` DESC

This is the bad query that runs very slowly. I thought it was just fine, I had no idea the entire table must be sorted before filtering begins. Indexing alone did not help.

    SELECT `cse_notes`.`notes_id`, `cse_notes`.`dateandtime`,
    `cse_case`.`case_id`, `cse_case_notes`.`attribute`    
    FROM  `cse_notes`     
    INNER JOIN  `cse_case_notes` ON `cse_notes`.`notes_uuid` =  `cse_case_notes`.`notes_uuid`    
    INNER JOIN `cse_case` ON  `cse_case_notes`.`case_uuid` = `cse_case`.`case_uuid`    
    WHERE `cse_notes`.`deleted` = 'N' 
    AND `cse_case`.`case_id` = :case_id    
    AND `cse_notes`.customer_id = :customer_id    
    ORDER BY `cse_notes`.dateandtime DESC LIMIT 0, 1000

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269643

This is to elaborate on Ersun's solution/comment.

Without the order by, SQL evaluates the query. In this case, it is a bunch of joins. Quite possibly, you have indexes on the join fields. So, the query proceeds by reading a record from phone_calls, looking up the data, checking the filter conditions, and returning it. It then goes to the the record and so on. Overall, it probably reads a few thousands or tens of thousands of records.

With the order by, SQL has to evaluate all the records in the query. It has to read all the phone calls, because the very last one might have the minimum value. It then does the sorting and returns the right records.

You might be able to speed up the query by having an index on phone_calls(status, owner_id, trigger_on) to satisfy the where clause.

Upvotes: 0

sidestepper
sidestepper

Reputation: 115

In my experience the fastest way to get performance from an SQL query is to simplify it into multiple steps. Take advantage of temporary tables and reduce the number of joins and operations per step(Eat memory, get speed). Forgive me for possible syntax errors below as I haven't used MySQL for a long time now, but you could rewrite your query as follows:

CREATE TEMPORARY TABLE scratch1 AS (
    SELECT
            ph.call_subject AS callSubject,
            ac.account_name AS accountName,
            DATE_FORMAT(ph.trigger_on, "%c/%e/%Y %h:%i %p") AS triggerOn,
            ac.industry_id,
            ph.call_code_id
    FROM
            phone_calls AS ph
            INNER JOIN accounts AS ac ON ph.account_id = ac.account_id
    WHERE   
            ac.status = 1 AND ph.status = 1 AND ph.owner_id = 1 AND ac.do_not_call = 0
            AND ph.trigger_on BETWEEN '2012-11-19 00:00:00' AND '2013-03-19 23:59:59' )

ALTER TABLE scratch1 ADD industry VARCHAR(255)
ALTER TABLE scratch1 ADD callCode VARCHAR(255)

UPDATE scratch1 s JOIN industries ind ON ind.industry_id = s.industry_id
SET s.industry = ind.name

UPDATE scratch1 s JOIN call_codes cc ON cc.call_code_id = s.call_code_id
SET s.callCode = cc.call_code_name

CREATE TEMPORARY TABLE scratch2 AS (
    SELECT * FROM scratch1 ORDER BY triggerOn ASC )

SELECT * FROM scratch2 LIMIT 0, 1000

Upvotes: 0

Ersun Warncke
Ersun Warncke

Reputation: 51

If you have a LIMIT of 5 rows then without the order the query can grab the first 5 rows it finds that match your other conditions.

If you have an ORDER BY clause it has to look at all the rows matching your other conditions and pick the 5 lowest.

Upvotes: 3

Related Questions