blacksun
blacksun

Reputation: 733

MySQL query takes ages to return on a huge table

I have a very big table and following code takes 990 sec. to complete. bdate and itype are indexed. What else do I need to optimize/change?

SELECT s, count(*) as total
FROM  `mt_ex_15` 
WHERE bdate > '2014-10-01' and bdate < '2014-11-01'
and itype = '3'
group by s
order by total desc

EDIT: Here is the EXPLAIN

id  select_type table   type    possible_keys   key key_len ref rows    Extra   
1   SIMPLE  mt_ex_15    ref itype,bdate,s   itype   2   const   44157686    Using where; Using temporary; Using filesort    

EDIT: I think I need to optimize my DB or my.cnf because even the following query took 40 secs.

SELECT count(*) as total
FROM  `mt_ex_15` 
WHERE bdate > '2015-02-01' and bdate < '2015-03-01'

And here is the explain:

 id     select_type     table   type    possible_keys   key     key_len     ref     rows    Extra   
1   SIMPLE  mt_ex_15    range   bdate   bdate   3   NULL    4494019     Using where; Using index

Upvotes: 0

Views: 64

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1271003

For this query:

SELECT s, count(*) as total
FROM  `mt_ex_15` 
WHERE bdate > '2014-10-01' and bdate < '2014-11-01' and itype = '3'
group by s
order by total desc

The best index is mt_ex_15(itype, bdate, s). The engine should be able to take full advantage of the index for the where clause. In addition, this is a covering index so the original data does not need to touched for this query.

If you had a list of all available "s" values, you could do this as a correlated subquery:

select s.*,
       (select count(*)
        from mt_ex_15 m
        where m.s = s.s and m.itype = 3 and m.bdate > '2014-10-01' and m.bdate < '2014-11-01'
       ) total
from s
having total > 0 -- using a convenient MySQL extension
order by total desc;

The best index for this query is mt_ex_15(s, itype, bdate).

Note: if itype is really an integer, you should remove the quotes around the constant. They are misleading.

Upvotes: 1

Rick James
Rick James

Reputation: 142528

GROUP BY s ORDER BY total -- You are stuck with at least one "filesort". Depending on various things, the sort may actually be in RAM.

An off-the-wall suggestion:

  • Change to GROUP BY itype, s -- The unnecessary field in the GROUP BY may lead to a better EXPLAIN.
  • INDEX(itype, s, bdate) -- in that order

If you are using MySQL 5.6.16 or later, ALTER TABLE ... ALGORITHM = INPLACE will be less invasive.

If bdate is a DATE, then bdate > '2014-10-01' eliminates Oct. 1; was that intentional?

Upvotes: 0

spencer7593
spencer7593

Reputation: 108500

Use EXPLAIN to see the execution plan.

Lacking any information about the table, we're really just guessing.

I'd try achieving the specified result like this:

CREATE INDEX `mt_ex_15_IX1` ON `mt_ex_15` (`itype`,`s`,`bdate`);

SELECT t.s
     , SUM(t.bdate > '2014-10-01' AND t.bdate < '2014-11-01') AS `total`
  FROM `mt_ex_15` t
 WHERE t.itype = '3'
 GROUP BY t.s
HAVING `total` > 0
 ORDER BY t.s DESC

Comparing the EXPLAIN output from this and from the original will (likely) show that the two queries are using different execution plans.

FOLLOWUP

With a suitable index, MySQL can avoid an expensive "Using filesort" operation. The index I recommended above will render the index on just the itype column redundant, and that index could be dropped. (Any query that was making use of that index can make use of the new index, since itype is the leading column.

The recommendation for the new index is based on the query... an equality predicate on itype (make that column the leading column), followed by s since there's a GROUP BY on that column. Including the bdate column in the index means that the query can be satisfied from the index, without a lookup to the underlying data page.

We'd expect the EXPLAIN output "Extra" column to show "Using index", and not show "Using filesort".

If adding an index is out of the question, then your best shot at avoiding a "Using filesort" is going to be to make use of an existing index that has column s as the leading column. But that means that the query is going to need to examine every row in the table; if the columns bdate and itype aren't included in the index, then that means an index lookup to every row in the table. But, this may perform faster. Check the output from EXPLAIN for this query:

EXPLAIN
SELECT t.s
     , SUM(t.itype = '3' AND t.bdate > '2014-10-01' AND t.bdate < '2014-11-01') 
       AS `total`
  FROM `mt_ex_15` t
 GROUP BY t.s
HAVING `total` > 0
 ORDER BY t.s DESC

Upvotes: 0

Related Questions