Query optimization (multiple joins)

Question

I would like to find a way to improve a query but it seems i've done it all. Let me give you some details.

Below is my query :

SELECT 
    `u`.`id` AS `id`,
    `p`.`lastname` AS `lastname`,
    `p`.`firstname` AS `firstname`,
    COALESCE(`r`.`value`, 0) AS `rvalue`,
    SUM(`rat`.`category` = 'A') AS `count_a`,
    SUM(`rat`.`category` = 'B') AS `count_b`,
    SUM(`rat`.`category` = 'C') AS `count_c`
FROM
    `user` `u`
    JOIN `user_customer` `uc` ON (`u`.`id` = `uc`.`user_id`)
    JOIN `profile` `p` ON (`p`.`id` = `u`.`profile_id`)
    JOIN `ad` FORCE INDEX (fk_ad_customer_idx) ON (`uc`.`customer_id` = `ad`.`customer_id`)
    JOIN `ac` ON (`ac`.`id` = `ad`.`ac_id`)
    JOIN `a` ON (`a`.`id` = `ac`.`a_id`)
    JOIN `rat` ON (`rat`.`code` = `a`.`rat_code`)
    LEFT JOIN `r` ON (`r`.`id` = `u`.`r_id`)
GROUP BY `u`.`id`
;

Note : Some table and column names are voluntarily hidden.

Now let me give you some volumetric data :

user => 6534 rows
user_customer => 12 923 rows
profile => 6511 rows
ad => 320 868 rows
ac => 4505 rows
a => 536 rows
rat => 6 rows
r => 3400 rows

And finally, my execution plan :

My query does currently run in around 1.3 to 1.7 seconds which is slow enough to annoy users of my application of course ... Also fyi result set is composed of 165 rows.

Is there a way I can improve this ?

Thanks.

EDIT 1 (answer to Rick James below) : What are the speed and EXPLAIN when you don't use FORCE INDEX?

Surprisingly it gets faster when i don't use FORCE INDEX. To be honest, i don't really remember why i've done that change. I've probably found better results in terms of performance with it during one of my various tries and didn't remove it since.

When i don't use FORCE INDEX, it uses an other index ad_customer_ac_id_blocked_idx(customer_id, ac_id, blocked) and times are around 1.1 sec. I don't really get it because fk_ad_customer_idx(customer_id) is the same when we talk about index on customer_id.

Rick James · Accepted Answer

Get rid of FORCE INDEX. Even if it helped yesterday; it may hurt tomorrow.

Some of these indexes may be beneficial. (It is hard to predict; so simply add them all.)

a:  (rat_code, id)
rat:  (code, category)
ac:  (a_id, id)
ad:  (ac_id, customer_id)
ad:  (customer_id, ac_id)
uc:  (customer_id, user_id)
uc:  (user_id, customer_id)
u:  (profile_id, r_id, id)

(This assumes that id is the PRIMARY KEY of each table. Note that none have id first.) Most of the above are "covering".

Another approach that sometimes helps: Gather the SUMs before joining to any unnecessary table. But is seems that p is the only table not involved in getting from u (the target of GROUP BY) to r and rat (used in aggregates). It would look something like:

SELECT ..., firstname, lastname
    FROM ( everything as above except for `p` ) AS most
    JOIN `profile` `p`  ON (`p`.`id` = most.`profile_id`)
    GROUP BY most.id

This avoids hauling around firstname and lastname while doing most of the joins and the GROUP BY.

When doing JOINs and GROUP BY, be sure to sanity check the aggregates. Your COUNTs and SUMs may be larger than they should be.

Query optimization (multiple joins)

Answers (2)

Related Questions