trentcroad
trentcroad

Reputation: 79

MySQL - Improving performance of a slow subquery

I have a large database of horse racing results. A normal query is very fast to run, however when using a subquery, performance is incredibly slow to the point it's not viable.

The purpose of the subquery is to determine the results of a given horse, prior to the date of the race. Hence the where clause.

I would like to improve performance and as I understand it using a join is much faster, however I haven't been able to get it to work successfully.

I would love to know how to join this particular query?

I apologize for not adding a SQL Fiddle, however the sheer size of the data required, even for a sample, is just too large.

select date, raceref, horse, rank, 

 (
SELECT ifnull(count(raceref),0)
FROM results
WHERE horse = t.horse
  AND date < t.date
   ) AS totalracesprior,

FROM results t
group by horse, raceref
order by raceref, horse

The output from EXPLAIN:

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   PRIMARY t   ALL NULL    NULL    NULL    NULL    110088  Using where; Using temporary; Using filesort
2   DEPENDENT SUBQUERY  results ALL NULL    NULL    NULL    NULL    110088  Using where

Indexes:

Table   Non_unique  Key_name    Seq_in_index    Column_name Collation   Cardinality Sub_part    Packed  Null    Index_type  Comment Index_comment
results 0   PRIMARY 1   id  A   110088  NULL    NULL        BTREE       

Upvotes: 0

Views: 71

Answers (2)

Rick James
Rick James

Reputation: 142208

Are there multiple rows with the same combination of horse, raceref? If so, the GROUP BY horse, raceref will not work as intended.

If there are not dups, then get rid of the GROUP BY; it adds work without changing the results.

I recommend two indexes: INDEX(horse, date) is useful for the subquery. Without the GROUP BY, INDEX(raceref, horse) is useful for the ORDER BY. It would be useful also for the GROUP BY if you swapped the order there.

Upvotes: 1

Solarflare
Solarflare

Reputation: 11096

You absolutely need an index for the column horse.

You can either use the index results(horse) or the index results(horse, date, raceref). The latter would provide full coverage of your subquery and probably yield better results, but it depends a little bit on your data if you would see a difference.

Upvotes: 0

Related Questions