Reputation: 1314
This is my sample query
Select table1.id
from table1
where table.id in (select table2.id
from table2
where table2.id in (select table3.id
from table3)
)
order by table1.id
limit 100
On checking the optimiser trace for the above query. Optimiser trace cost
As DUPLICATE-WEEDOUT cost is less, mysql took DUPLICATE-WEEDOUT strategy for the above query.
Seems everything good in join_optimization part right. But finally, after checking the join_execution part. DUPLICATE-WEEDOUT usually creates temp table. But here as the heap-size is not enough for temp table, it went on creating ondisk temp table(converting_tmp_table_to_ondisk).
Due to disk temp table my query execution became slower.
So what happened here?
Optimiser trace doesn't calculate the cost of disk table in join-optimisation part itself. If disk table cost was calculated, it would be higher than first match. Then final_semijoin_strategy would be FIRST-MATCH strategy, with this my query would have been faster.
Is there any way MYSQL calculate the cost of disk table in join-optimisation part itself or any other work around for this particular issue?
MYSQ-5.7, INNODB
Note: This is a very dynamic query where multiple condition will add based on request in query. So I have done optimising the query in all possible manner. And finally stuck with this disk table cost issue. Kindly avoid optimising the query(like changing the query structure, forcing first-match strategy). And for increasing the heap size(Im not sure much about it, in different forum many said it might bring different issue in other queries)
Upvotes: 7
Views: 289
Reputation: 142366
IN( SELECT ... )
has been notoriously inefficient. Try to avoid it.
The query, as presented, is probably equivalent to
SELECT t1.id
FROM t1
JOIN t2 USING(id)
JOIN t3 USING(id)
ORDER BY id
LIMIT 100
which will optimize nicely.
This formulation should not need to build any temp table, much less a disk-based one.
Upvotes: 0