Ronak Patel
Ronak Patel

Reputation: 3849

mysql - long running query without proper indexes

this query runs about 15 hours in production, I am looking for alternatives to improvements to this,

some improvements those are I think may help are commented in here:

SELECT  table1.*
    FROM  table1
    WHERE  UPPER(LEFT(table1.cloumn1, 1)) IN ('A', 'B')
      AND  table1.cloumn2 = 'N' /* add composite index for cloumn2,
        column3 */
      AND  table1.cloumn3 != 'Y'
      AND  table1.id IN (
        SELECT  MAX(id)
            FROM  table1
            GROUP BY  column5,column6
                        ) /* move this clause to 2nd after
    where  */
      AND  table1.column4 IN (
        SELECT  column1
            FROM  table2
            WHERE  column2 IN ('VALUE1', 'VALUE2')
              AND  (SUBSTRING(column3,6,1) = 'Y'
                      OR  SUBSTRING(column3,25,1) = 'Y')
                          ) /* move this clause to 1st after
    where  */
      AND  (table1.column5,table1.column6) NOT IN (
        SELECT  column1, column2
            FROM  table3
            WHERE  table3.column3 IN ('A', 'B')/* add index for this column*/
           )
      AND  DATE_FORMAT(timstampColumn, '%Y/%m/%d') > DATE_ADD(CURRENT_DATE,
                INTERVAL - 28 DAY)) /* need index  ON this col? */ ;

Any comments/suggestions are appreciated.

Update: with only updating filtering order, Query performance was improved to ~ 28 secs, will update here after adding some indexes and replacing some subqueries to joins

Upvotes: 2

Views: 126

Answers (2)

Rick James
Rick James

Reputation: 142346

(It might help to see SHOW CREATE TABLE.)

AND  DATE_FORMAT(timstampColumn, '%Y/%m/%d') > DATE_ADD(CURRENT_DATE,
            INTERVAL - 28 DAY))

cannot use an index; this might be equivalent:

AND  timstampColumn > CURRENT_DATE - INTERVAL 28 DAY

Please provide EXPLAIN.

What version are you using?

It might (version dependent) help to turn the IN ( SELECT ... ) clauses into 'derived' tables:

JOIN ( SELECT ... ) ON ...

WHERE (x,y) IN ... is not well optimized. What types of values are they?

With a *_ci collation,

UPPER(LEFT(table1.cloumn1, 1)) IN ('A', 'B')

could be done:

LEFT(table1.cloumn1, 1) IN ('A', 'B')

That won't help performance noticeably. It would be better not to have to break apart columns for testing.

This might use an index involving cloumn1:

    table1.cloumn1 >= 'A'
AND table1.cloumn1 <  'C'

The order of things AND'd together rarely matters. The order in the INDEX can make a big difference.

Upvotes: 0

Kickstart
Kickstart

Reputation: 21523

Assuming you can add useful indexes (which will help on some of your checks), then maybe try and exclude rows as early as possible.

I suspect you have quite a few rows on table1 for each column5 / column6 combination. If you can get just the latest of each of these (ie, using a sub query that you join) as early as possible then you can exclude most rows from table1 before you need to check any of the non indexed WHERE clauses. You can also exclude some of these by doing a further join against a sub query on table3.

Not tested, but if my assumptions about your database structure are correct then this might be an improvement:-

SELECT table1.* 
FROM 
(
    SELECT MAX(table1.id) AS max_id
    FROM table1 
    INNER JOIN 
    (
        SELECT DISTINCT column1, column2 
        FROM table3
        WHERE table3.column3 IN ('A', 'B')
        AND DATE_FORMAT(timstampColumn, '%Y/%m/%d') > DATE_ADD(CURRENT_DATE, INTERVAL - 28 DAY)
    ) sub0_0
    ON table1.column5 = sub0_0.column1
    AND  table1.column6 = sub0_0.column2
    WHERE (table1.cloumn1 LIKE 'A%' OR table1.cloumn1 LIKE 'B%')
    AND table1.cloumn2 = 'N'
    AND table1.cloumn3 != 'Y'
    GROUP BY table1.column5,
            table1.column6
) sub0
INNER JOIN table1
ON table1.id = sub0.max_id
INNER JOIN
(
    SELECT DISTINCT column1 
    FROM table2
    WHERE column2 IN ('VALUE1', 'VALUE2') 
    AND (SUBSTRING(column3,6,1) = 'Y' 
    OR SUBSTRING(column3,25,1) = 'Y')
) sub1
ON table1.column4 = sub1.column1

Upvotes: 1

Related Questions