Reputation: 5228
I have the following query:
SELECT driver_id, first_name, last_name
FROM drivers
WHERE driver_id NOT IN
(SELECT DISTINCT w.driver_id from waybills w
JOIN drivers d ON d.driver_id = w.driver_id
WHERE w.waybill_owner = 1
AND w.waybill_status = 'dispatched'
AND w.delivery_date = '2014-10-28')
AND driver_id NOT IN
(SELECT DISTINCT wm.driver_id from waybill_movements wm
JOIN drivers d ON d.driver_id = wm.driver_id
WHERE wm.movement_owner = 1
AND wm.delivery_date = '2014-10-28')
AND status = 'active'
AND driver_owner = 1
ORDER BY last_name ASC
How can I optimize this query?
Query works well and return expected results but my question is if the query can be optimized.
Many thanks for your time and help.
Update:
And, yes, I have these indexes:
waybills(waybill_owner, waybill_status, w.delivery_date) waybill_movements (wm.movement_owner, delivery_date) drivers(driver_id a primary key and drivers(status, driver_owner)
The table structure is not need it for optimization
I didn't expect to have so many answers. Thank you all.
Upvotes: 2
Views: 59
Reputation: 48139
I have typically found that doing left-joins and looking for NULL of the table result easier and also great utilization of the indexes that can be made.
The drivers table I would have an index ON (driver_owner, status, driver_id)
Your waybill table, index ON(waybill_owner, driver_id, delivery_date, waybill_status)
waybill_movements, index ON(movement_owner, driver_id, delivery_date )
SELECT
d1.driver_id,
d1.first_name,
d1.last_name
FROM
drivers d1
LEFT JOIN waybills w
ON d1.driver_id = w.driver_id
AND d1.driver_owner = w.waybill_owner
AND w.waybill_status = 'dispatched'
AND w.delivery_date = '2014-10-28'
LEFT JOIN waybill_movements wm
ON d1.driver_id = wm.driver_id
AND d1.driver_owner = wm.movement_owner
AND wm.delivery_date = '2014-10-28'
where
d1.driver_owner = 1
AND d1.status = 'active'
AND w.driver_ID IS NULL
AND wm.driver_id IS NULL
ORDER BY
d1.last_name ASC
By including the "IS NULL" for the waybill and waybill_movements, you are only getting those that DO NOT have a matching record.
Also, seeing your waybill_owner = 1... is that coincidence to that of the driver_owner = 1 also? IF the waybill owner should always be the same as the driver owner, then I would change the index ON the waybill table to have waybill_owner first, then driver_id, then the rest AND change the joins to be based on the drivers.driver_owner = waybills.waybill_owner (similarly on waybill_movements)
REVISED per feedback
Revised the indexes and the query to join based on the same drivers.driver_owner table to the waybill tables to prevent false results of cross-owner matches by whatever strange possibility may exist.
Upvotes: 1
Reputation: 50034
It depends on what the definition of is
is. You may want to rewrite those WHERE IN
subqueries into your FROM
clause and then run your existing query against that, head to head, and see if MySQL explain gives a different execution path and stats for that.
No doubt it will since you pretty much have to hit your drivers
table three times in your existing query in order to get to the results you want. This is totally unnecessary overhead.
SELECT d.driver_id, d.first_name, d.last_name
FROM drivers d
LEFT OUTER JOIN waybills w ON
d.driver_id = w.driver_id AND
w.waybill_owner = 1 AND
w.waybill_status = 'dispatched' AND
w.delivery_date = '2014-10-28'
LEFT OUTER JOIN waybill_movements wm ON
d.driver_id = wm.driver_id AND
wm.movement_owner = 1 AND
wm.delivery_date = '2014-10-28')
WHERE
w.driver_id IS NULL AND
wm.driver_id IS NULL AND
d.status = 'active' AND
d.driver_owner = 1
ORDER BY last_name ASC
Your ORDER BY
is also expensive, as they always are. If it is not needed, then it might be a good idea to remove it.
You may want to run @StuartLC's suggested SQL and this one through explain and see MySQL likes either one of them better. You may find better results with his, but sometimes it's a coin flip. MySQL is generally not terribly speedy about derived tables, so cramming as much into a traditional join as possible can yield better results. But it totally depends on the size of the tables, whether or not they have proper indexes and all that fun stuff.
Upvotes: 2
Reputation: 107247
Assuming that you have already looked at the indexing
waybills(waybill_owner, waybill_status, w.delivery_date)
waybill_movements (wm.movement_owner, delivery_date)
drivers(driver_id)
- presumably a primary key? and possibly also drivers(status, driver_owner)
Two other improvements spring to mind
where driver_id IS NOT NULL
if this this is a nullable foreign keyNOT IN
SELECT driver_id, first_name, last_name
FROM drivers
WHERE driver_id NOT IN
(
SELECT DISTINCT w.driver_id
from waybills w
WHERE w.waybill_owner = 1
AND w.waybill_status = 'dispatched'
AND w.delivery_date = '2014-10-28'
UNION
SELECT DISTINCT wm.driver_id
from waybill_movements wm
WHERE wm.movement_owner = 1
AND wm.delivery_date = '2014-10-28'
)
AND status = 'active'
AND driver_owner = 1
ORDER BY last_name ASC
Upvotes: 1
Reputation: 490
SELECT d.driver_id, d.first_name, d.last_name
FROM drivers AS d
LEFT JOIN waybills AS w
ON d.driver_id = w.driver_id
AND w.waybill_owner = 1
AND w.waybill_status = 'dispatched'
AND w.delivery_date = '2014-10-28'
LEFT JOIN waybill_movements AS wm
ON d.driver_id = wm.driver_id
AND wm.movement_owner = 1
AND wm.delivery_date = '2014-10-28'
WHERE w.driver_id IS NULL
AND wm.driver_id IS NULL
AND d.status = 'active'
AND d.driver_owner = 1
ORDER BY d.last_name ASC
Upvotes: 0