Reputation: 11298
My query is currently taking roughly 3 seconds, which I'm sure can be optimized. I just can't figure out how to optimize it.
My app has a reasonably big products table (roughly 500,000 records). Each product can be listed on one of 50 domains (listed in a domains table). The links between products and domains are stored in the domains_products table (which has approximately 1,400,000 records). The slow query is in my app's admin section, where I need to be able to see products that are NOT listed on any domain.
Stripped to the bare bones with all unrelated joins removed, the query in question is:
SELECT `products`.*
FROM `products`
LEFT JOIN `domains_products`
ON `domains_products`.`product_id` = `products`.`id`
WHERE `products`.`deleted` = 'N'
AND `domains_products`.`domain_id` IS NULL
ORDER BY `products`.`id` ASC
In this form, the query takes more than 3 seconds and returns a little over 3,000 products (which is correct). If I remove either WHERE clause, the query takes 0.12 seconds (but obviously does not return the correct results).
Both tables use the InnoDB engine. The products table has a primary key on the id column and an index on the deleted column. The domains_products table has only two columns, product_id and domain_id; the primary key spans both columns, and each column also has its own index. All relevant columns are NOT NULL columns.
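For reference, here is a minimal sketch of the schema as described above. The column types, the order of the columns in the composite primary key, and the index names are assumptions on my part (only the names `deleted` and `product_id` are grounded, since they show up as keys in the EXPLAIN output), so treat it as an illustration rather than the real DDL.

```sql
-- Sketch only: column types, key order and some index names are assumed.
CREATE TABLE products (
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
  deleted ENUM('Y','N') NOT NULL DEFAULT 'N',  -- assumed type
  -- other product columns omitted
  PRIMARY KEY (id),
  KEY deleted (deleted)
) ENGINE=InnoDB;

CREATE TABLE domains_products (
  product_id INT UNSIGNED NOT NULL,
  domain_id  INT UNSIGNED NOT NULL,
  PRIMARY KEY (product_id, domain_id),  -- column order is an assumption
  KEY product_id (product_id),
  KEY domain_id (domain_id)
) ENGINE=InnoDB;
```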
EXPLAIN gives me this:
id  select_type  table             type  possible_keys  key         key_len  ref          rows    Extra
1   SIMPLE       products          ref   deleted        deleted     1        const        188616  Using where
1   SIMPLE       domains_products  ref   product_id     product_id  4        products.id  1       Using where; Using index; Not exists
Note that although MySQL has discovered the correct keys, it doesn't actually seem to be using them.
The profiler says this:
Status Time
Starting 62 µs
Checking Permissions 7 µs
Checking Permissions 5 µs
Opening Tables 38 µs
System Lock 13 µs
Init 37 µs
Optimizing 17 µs
Statistics 1.3 ms
Preparing 25 µs
Executing 5 µs
Sorting Result 5 µs
Sending Data 3.3 s
End 28 µs
Query End 8 µs
Closing Tables 25 µs
Freeing Items 297 µs
Logging Slow Query 4 µs
Cleaning Up 5 µs
Note that it seems to be hanging on Sending Data. I've tried replacing the join by a NOT IN:
SELECT `products`.*
FROM `products`
WHERE `products`.`deleted` = 'N'
AND `products`.`id` NOT IN (
SELECT `product_id`
FROM `domains_products`
)
ORDER BY `products`.`id` ASC
This query gives the exact same results, but takes 3.8 seconds.
Can anyone point me in the right direction to optimize this query?
Upvotes: 2
Views: 216
Reputation: 4053
Try this, and let me know how long it takes.
SELECT `products`.*
FROM `products`
WHERE `products`.`deleted` = 'N'
AND NOT EXISTS (SELECT 1
FROM `domains_products`
WHERE `domains_products`.`product_id` = `products`.`id`
)
ORDER BY `products`.`id` ASC;
Upvotes: 0
Reputation: 126
It seems that the problem is with the "deleted" column. I'm guessing that almost all of the items in the products table are marked with "N", which makes the index on the "deleted" column pretty useless in this case.
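To check that guess, something like the following might help. It is only a sketch: the first statement shows how the values of `deleted` are distributed, and the second reruns the original query while asking MySQL to skip that index. The index name `deleted` is taken from the `key` column of the EXPLAIN output in the question; whether ignoring it actually speeds things up is something you would have to measure.

```sql
-- How selective is the index? Count the rows per value of `deleted`.
SELECT `deleted`, COUNT(*) AS `n`
FROM `products`
GROUP BY `deleted`;

-- If nearly every row is 'N', try the original query without that index.
SELECT `products`.*
FROM `products` IGNORE INDEX (`deleted`)
LEFT JOIN `domains_products`
  ON `domains_products`.`product_id` = `products`.`id`
WHERE `products`.`deleted` = 'N'
  AND `domains_products`.`domain_id` IS NULL
ORDER BY `products`.`id` ASC;
```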
One thing you can do is create another table, say deleted_domains_products, that stores the product_id (and the domain_id if you want). Then you create a trigger so that every time an entry is deleted from domains_products, it inserts an entry into that table. That gives you a much smaller set to query against. When you're done, you can truncate that table for the next time, so it should always be pretty quick.
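A rough sketch of that idea follows. The table name comes from the suggestion above; the column types and the trigger name `log_domains_products_delete` are made up for illustration, so adjust them to your schema.

```sql
-- Hypothetical log table; column types are assumed.
CREATE TABLE deleted_domains_products (
  product_id INT UNSIGNED NOT NULL,
  domain_id  INT UNSIGNED NOT NULL
) ENGINE=InnoDB;

-- Record every link removed from domains_products (trigger name is made up).
CREATE TRIGGER log_domains_products_delete
AFTER DELETE ON domains_products
FOR EACH ROW
  INSERT INTO deleted_domains_products (product_id, domain_id)
  VALUES (OLD.product_id, OLD.domain_id);

-- Candidates: logged products that no longer have any domain link.
SELECT p.*
FROM products AS p
JOIN (SELECT DISTINCT product_id FROM deleted_domains_products) AS d
  ON d.product_id = p.id
WHERE p.deleted = 'N'
  AND NOT EXISTS (
    SELECT 1 FROM domains_products AS dp WHERE dp.product_id = p.id
  )
ORDER BY p.id ASC;
```

Keep in mind that this only catches products that lost a link after the trigger was created. Products that were never listed on any domain would not show up in the log table, so you would still need the full query at least once.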
Upvotes: 1
Reputation: 262
Try to create the following indexes and then rerun the query:
Tell us how it goes.
Upvotes: 0