Reputation: 1300
I would like to find a way to improve a query but it seems i've done it all. Let me give you some details.
Below is my query :
SELECT
`u`.`id` AS `id`,
`p`.`lastname` AS `lastname`,
`p`.`firstname` AS `firstname`,
COALESCE(`r`.`value`, 0) AS `rvalue`,
SUM(`rat`.`category` = 'A') AS `count_a`,
SUM(`rat`.`category` = 'B') AS `count_b`,
SUM(`rat`.`category` = 'C') AS `count_c`
FROM
`user` `u`
JOIN `user_customer` `uc` ON (`u`.`id` = `uc`.`user_id`)
JOIN `profile` `p` ON (`p`.`id` = `u`.`profile_id`)
JOIN `ad` FORCE INDEX (fk_ad_customer_idx) ON (`uc`.`customer_id` = `ad`.`customer_id`)
JOIN `ac` ON (`ac`.`id` = `ad`.`ac_id`)
JOIN `a` ON (`a`.`id` = `ac`.`a_id`)
JOIN `rat` ON (`rat`.`code` = `a`.`rat_code`)
LEFT JOIN `r` ON (`r`.`id` = `u`.`r_id`)
GROUP BY `u`.`id`
;
Note : Some table and column names are voluntarily hidden.
Now let me give you some volumetric data :
user => 6534 rows
user_customer => 12 923 rows
profile => 6511 rows
ad => 320 868 rows
ac => 4505 rows
a => 536 rows
rat => 6 rows
r => 3400 rows
And finally, my execution plan :
My query does currently run in around 1.3 to 1.7 seconds which is slow enough to annoy users of my application of course ... Also fyi result set is composed of 165 rows.
Is there a way I can improve this ?
Thanks.
EDIT 1 (answer to Rick James below) : What are the speed and EXPLAIN when you don't use FORCE INDEX?
Surprisingly it gets faster when i don't use FORCE INDEX. To be honest, i don't really remember why i've done that change. I've probably found better results in terms of performance with it during one of my various tries and didn't remove it since.
When i don't use FORCE INDEX, it uses an other index ad_customer_ac_id_blocked_idx(customer_id, ac_id, blocked) and times are around 1.1 sec. I don't really get it because fk_ad_customer_idx(customer_id) is the same when we talk about index on customer_id.
Upvotes: 1
Views: 197
Reputation: 142298
Get rid of FORCE INDEX
. Even if it helped yesterday; it may hurt tomorrow.
Some of these indexes may be beneficial. (It is hard to predict; so simply add them all.)
a: (rat_code, id)
rat: (code, category)
ac: (a_id, id)
ad: (ac_id, customer_id)
ad: (customer_id, ac_id)
uc: (customer_id, user_id)
uc: (user_id, customer_id)
u: (profile_id, r_id, id)
(This assumes that id
is the PRIMARY KEY
of each table. Note that none have id
first.) Most of the above are "covering".
Another approach that sometimes helps: Gather the SUMs
before joining to any unnecessary table. But is seems that p
is the only table not involved in getting from u
(the target of GROUP BY
) to r
and rat
(used in aggregates). It would look something like:
SELECT ..., firstname, lastname
FROM ( everything as above except for `p` ) AS most
JOIN `profile` `p` ON (`p`.`id` = most.`profile_id`)
GROUP BY most.id
This avoids hauling around firstname and lastname while doing most of the joins and the GROUP BY
.
When doing JOINs
and GROUP BY
, be sure to sanity check the aggregates. Your COUNTs
and SUMs
may be larger than they should be.
Upvotes: 6
Reputation: 48139
First, you don't need to tick
.everyTableAndColumn
in your queries, nor result columns, aliases, etc. The tick
marks are used primarily when you are in conflict with a reserved work so the parser knows you are referring to a specific column... like having a table with a COLUMN named "JOIN", but JOIN is part of SQL command... see the confusion it would cause. Helps clean readability too.
Next, and this is just personal preference and can help you and others following behind you on data and their relationships. I show the join as indented from where it is coming from. As you can see below, I see the chain on how do I get from the User (u alias) to the rat alias table... You get there only by going 5 levels deep, and I put the first table on the left-side of the join (coming from table) then = the table joining TO right-side of join.
Now, that I can see the relationships, I would suggest the following. Make COVERING indexes on your tables that have the criteria, and id/value where appropriate. This way the query gets as best it needs, the data from the index page vs having to go to the raw data. So here are suggestions for indexes.
table index
user_customer ( user_id, customer_id ) -- dont know what your fk_ad_customer_idx parts are)
ad ( customer_id, ac_id )
ac ( id, a_id )
a (id, rat_code )
rat ( code, category )
Reformatted query for readability and seeing relationships between the tables
SELECT
u.id,
p.lastname,
p.firstname,
COALESCE(r.value, 0) AS rvalue,
SUM(rat.category = 'A') AS count_a,
SUM(rat.category = 'B') AS count_b,
SUM(rat.category = 'C') AS count_c
FROM
user u
JOIN user_customer uc
ON u.id = uc.user_id
JOIN ad FORCE INDEX (fk_ad_customer_idx)
ON uc.customer_id = ad.customer_id
JOIN ac
ON ad.ac_id = ac.id
JOIN a
ON ac.a_id = a.id
JOIN rat
ON a.rat_code = rat.code
JOIN profile p
ON u.profile_id = p.id
LEFT JOIN r
ON u.r_id = r.id
GROUP BY
u.id
Upvotes: 1