I'm trying to find the most recent entry time at a bunch of specific dates. When I run select max(ts) as maxts from factorprice where ts <= '2011-1-5' It returns very quickly. EXPLAIN gives select_type SIMPLE and "Select tables optimized away". But when I run select (select max(ts) from factorprice where ts <= dates.dt) as maxts, dates.dt from trends.dates where dates.dt in ('2011-1-6'); It takes a long time to return (~10 seconds). Explain gives: select_type=PRIMARY table=dates rows=506 Extra=Using where select_type=DEPENDENT SUBQUERY table=factorprice type=index possible_keys=PRIMARY key=PRIMARY keylen=8 rows=26599224 Extra=Using where; Using index This query also takes a long time (10 sec) select dt, max(ts) as maxts from factorprice as f inner join trends.dates as d where ts <= dt and dt in ('2011-1-6') group by dt; Explain gives: select_type=SIMPLE table=d type=ALL rows =509 Extra=Using where select_type=SIMPLE table=f type=range possible_keys=PRIMARY key=PRIMARY keylen=8 rows=26599224 Extra=Using where; Using index I'd like to do this same operation on many different dates. Is there a way I can do that efficiently?

Reputation: 3237

Fast query when executed individually, slow when nested

I'm trying to find the most recent entry time at a bunch of specific dates. When I run

select max(ts) as maxts from factorprice where ts <= '2011-1-5'

It returns very quickly.

EXPLAIN gives select_type SIMPLE and "Select tables optimized away".

But when I run

select (select max(ts) from factorprice where ts <= dates.dt) as maxts, dates.dt 
   from 
   trends.dates where dates.dt in ('2011-1-6');

It takes a long time to return (~10 seconds).

Explain gives:

select_type=PRIMARY table=dates rows=506 Extra=Using where
select_type=DEPENDENT SUBQUERY table=factorprice type=index possible_keys=PRIMARY key=PRIMARY keylen=8 rows=26599224 Extra=Using where; Using index

This query also takes a long time (10 sec)

select dt, max(ts) as maxts from factorprice as f inner join trends.dates as d
   where ts <= dt and dt in ('2011-1-6')
   group by dt;

Explain gives:

select_type=SIMPLE table=d type=ALL rows =509 Extra=Using where
select_type=SIMPLE table=f type=range possible_keys=PRIMARY key=PRIMARY keylen=8 rows=26599224 Extra=Using where; Using index

I'd like to do this same operation on many different dates. Is there a way I can do that efficiently?

Upvotes: 0

Answers (3)

bobwienholt

Reputation: 17610

Judging from your EXPLAIN output. This query is visiting 506 rows in the dates table and then for each of those rows, over 26 million rows in the factorprice table. 10 seconds to do all of that isn't too bad.

My guess is that you are inadvertently creating a CROSS JOIN situation where every row of one table is matched up with every row in another table.

Upvotes: 0

georgepsarakis

Reputation: 1957

This part of your SQL is a dependent query

select max(ts) from factorprice where ts <= dates.dt

which is executed for each row in the resultset. So the total time is approximately the time of the standalone query times the rows in the result set.

Upvotes: 0

Philippe Plantier

Reputation: 8073

It looks like this bug:

http://bugs.mysql.com/bug.php?id=32665

Maybe if you create an index on dates.dt, it will go away.

Upvotes: 1

Fast query when executed individually, slow when nested

Answers (3)

Related Questions