dsign
dsign

Reputation: 12700

Can I optimize this query?

This very simple query is taking a lot of time:

SELECT text, url, docid 
FROM retrieve 
LEFT JOIN citations2 
ON citations2.fromdoc = retrieve.docid 
WHERE citations2.todoc IS NULL 
LIMIT 10;

It has a left join with a NULL condition... could that be the cause? I have put indices everywhere.

Here is the slow-query log:

# Time: 130404  8:00:31
# User@Host: em[em] @ zebra [130.239.162.142]
# Query_time: 27.006579  Lock_time: 0.000019 Rows_sent: 0  Rows_examined: 90682
use em_bg04;
SET timestamp=1365055231;
SELECT text, url, docid FROM retrieve LEFT JOIN citations2 ON citations2.fromdoc =   retrieve.docid WHERE citations2.todoc IS NULL LIMIT 10;

Here is a schematic view of the involved tables, together with the sizes (keep scrolling down to see the output of EXPLAIN for the query)

Table citations2

Table retrieve

Here's the output of EXPLAIN:

enter image description here

So it seems that it has to go through the whole table... .I of course read this, but I can't make sense of it. So, is there anyway of making this query faster?

Upvotes: 2

Views: 55

Answers (2)

Jürgen Steinblock
Jürgen Steinblock

Reputation: 31723

Meherzads query looks promissing. But I would use his query in combination with a newly created index

IDX_FROMDOC_TODOC which contains both columns.

Upvotes: 1

Meherzad
Meherzad

Reputation: 8553

Try this query

You are taking advantage of short circuit, so if first condition goes false it wont go for checking the second condition..

Hope it helps...

SELECT 
   text, 
   url, 
   docid 
FROM  
   retrieve 
LEFT JOIN 
   citations2 
ON 
   citations2.todoc IS NULL AND 
   citations2.fromdoc = retrieve.docid 
LIMIT 10;

Upvotes: 2

Related Questions