quibuz
quibuz

Reputation: 41

How to speed up left join queries by indexing?

At the moment I am experiencing some slower MySQL queries in my application which I want to speed up. Unfortunately I’m not quite sure which is the correct way to do it.

I have the following (fictitious) tables: Book, Page and Word.
Word is child of Page by word_page_id
Page is child of Book by page_book id

I already have individual indexes on page_book_id, word_page_id, book_user_id and book_flag_delete.

SELECT `book`.*, COUNT(word_id) AS `word_amount` FROM `book` 
LEFT JOIN `page` ON page_book_id = book_id 
LEFT JOIN `word` ON word_page_id = paragraph_id 
WHERE (book_user_id = 1) AND (book_flag_delete IS NULL) 
GROUP BY `book_id` 
ORDER BY `book_id` ASC LIMIT 100

SELECT COUNT(DISTINCT `book_id`) AS `book_row_count` FROM `book` 
LEFT JOIN `page` ON page_book_id = book_id 
LEFT JOIN `word` ON word_page_id = page_id 
WHERE (book_user_id = 59) AND (book_flag_delete IS NULL)

Any ideas how to speed up such queries? Is there extra indexing involved?

Upvotes: 3

Views: 7926

Answers (4)

sjas
sjas

Reputation: 19667

Set indexes on the fields you use for joining.

Further make sure that these have both the same datatype, encoding, and collation, else the index will also not be used.

mysql> EXPLAIN <query> will show you the actually used fields (key column in output) and the available indexes (possible_keys output field).

Upvotes: 4

Rick James
Rick James

Reputation: 142298

The optimal schema for a many-to-many mapping table is

CREATE TABLE XtoY (
    # No surrogate id for this table
    x_id MEDIUMINT UNSIGNED NOT NULL,   -- For JOINing to one table
    y_id MEDIUMINT UNSIGNED NOT NULL,   -- For JOINing to the other table
    # Include other fields specific to the 'relation'
    PRIMARY KEY(x_id, y_id),            -- When starting with X
    INDEX      (y_id, x_id)             -- When starting with Y
) ENGINE=InnoDB;

The details on 'why' are in my index cookbook

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269753

For this query:

SELECT b.*, COUNT(w.word_id) AS `word_amount`
FROM `book` b LEFT JOIN
     `page` p
     ON p.page_book_id = b.book_id LEFT JOIN
     `word` w
     ON w.word_page_id = p.paragraph_id 
WHERE (b.book_user_id = 1) AND (b.book_flag_delete IS NULL) 
GROUP BY b.`book_id` 
ORDER BY b.`book_id` ASC
LIMIT 100;

The best indexes are: book(user_id, book_flag_delete, book_id), page(page_book_id, paragraph_id), and word(word_page_id, word_id).

However, the overall group by might be expensive. You might try writing the query as:

SELECT b.*,
       (SELECT COUNT(w.word_id) 
        FROM `page` p JOIN
             `word` w
             ON w.word_page_id = p.paragraph_id 
        WHERE p.page_book_id = b.book_id
       ) AS `word_amount`
FROM `book` b LEFT JOIN         
WHERE (b.book_user_id = 1) AND (b.book_flag_delete IS NULL) 
ORDER BY b.`book_id` ASC
LIMIT 100;

The same indexes indexes work here. But, this query should avoid a group by on all the data at once (instead, it uses the indexes for the aggregation).

Upvotes: 1

FirebladeDan
FirebladeDan

Reputation: 1069

In your select you're gonna want to refrain from using the wildcard "*" to grab columns. Plus utilize aliases ALWAYS!! This will keep your db from having to create a "virtual" alias.

select book1.column1, book1.column2, page1.column1 
from book book1 
left join page page1 
on page1.page_book_id = book1.book_id
..... blah

Upvotes: -2

Related Questions