mmswe
mmswe

Reputation: 707

Optimizing primary id relation table

tag_relation table has tag_id and comment_id fields only and both of them are indexed. (there is no primary) it has InnoDB type.

Following query takes long time to execute. How can I make it faster?

All comment_id, tag_id, status, datetime fields are indexed. I really have no idea how to optimize it further.

SELECT
    text
FROM comment
INNER JOIN tag_relation
    ON tag_relation.comment_id=comment.comment_id
WHERE tag_id='1022278'
AND status=1
ORDER BY comment.datetime DESC LIMIT 0,20

Main cause of slowness is tag_relation table which has 1.5 million records. When it has less records execution time was faster.

Query plan:

Upvotes: 1

Views: 74

Answers (3)

koriander
koriander

Reputation: 3258

I think the problem is in the "status" field. Although it is indexed, the index is not being used. It says "using where" for that table. You can force the use of the index for status but I'm not sure it will be useful, depending on selectivity, i.e., how many different values can "status" take. Alternatively, the documentation says that if "status" allows for NULL then you'll see the "using where". Does it allow for NULLs? If so, consider restricting it.

I just noticed that I overlooked the "ORDER BY", comment.datetime will need an index.

If you already have an index, then try a subquery:

SELECT     text
FROM       tag_relation
INNER JOIN (SELECT c.comment_id, c.text, c.datetime
            FROM comment c
            WHERE c.status = 1) comment
        ON tag_relation.comment_id = comment.comment_id
WHERE tag_id='1022278'
ORDER BY comment.datetime DESC LIMIT 0,20

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270391

This is your query:

SELECT c.text
FROM comment c INNER JOIN
     tag_relation tr
     ON tr.comment_id = c.comment_id
WHERE t.tag_id = 1022278 AND c.status = 1
ORDER BY c.datetime DESC
LIMIT 0, 20;

First, notice that I removed the single quotes from the value 1022278. If this is really a number, the single quotes can sometimes confuse SQL optimizers. There are two ways to go about optimizing this query, depending on the selectivity of the various conditions. The first is to have the indexes:

tag_relation(tag_id, comment_id)
comment(comment_id, status, datetime, text)

The second is a covering index for comments, and the most important part is the comment_id column.

The second is:

comment(status, comment_id, datetime)
tag_relation(comment_id, tag_id)

The basic issue is which table is scanned first for the join. Using this index, the query would be processed as:

SELECT c.text
FROM comment c INNER JOIN
     tag_relation
WHERE c.status = 1 AND
      EXISTS (SELECT 1
              FROM tag_relation tr
              WHERE tr.comment_id = c.comment_id AND tr.tag_id = 1022278 
             )
ORDER BY c.datetime DESC
LIMIT 0, 20;

I'm not 100% sure that this avoids the file sort on the result set, but it might work.

Upvotes: 3

Lennart - Slava Ukraini
Lennart - Slava Ukraini

Reputation: 7181

If I get it right you have one index for tag_id and another index for comment_id. Try creating an composite index like:

create index ... on tag_relation(tag_id, comment_id)

This will make the index with tag_id redundant so it can be dropped.

AFAIK MySQL cannot do index anding, but even if it could a composite index would be more efficient.

Upvotes: 2

Related Questions