Reputation: 25
I'm writing a query to exclude a certain group of employees from a table. Let's say I have Table 1, transaction_details, that has information that I want to select. Table 2, employee_task_associations, maps each employee to a task that they were assigned to on a particular day. It has a field for the employee_id, and a field called map_id which is the id of the different tasks. This table is an associative table so that Tables 1 and 3 can have a many-to-many relationship. Table 3, employee_tasks, has a list of all tasks that an employee can have.
I have written this query, which is functional, but not optimized:
SELECT someInfo FROM transaction_details TD
WHERE TD.employee_id NOT IN
(SELECT employee_id from employee_task_associations ETA
WHERE map_id IN
(SELECT id FROM employee_tasks ET
WHERE ET.taskName = "The task I want to exclude"))
While this works, it will run multiple queries. I want to speed things up by replacing my nested NOT IN and IN statements with JOINS.
I know that I can replace the bottom four lines with the following:
SELECT employee_id FROM employee_task_assocations ETA
LEFT OUTER JOIN employee_tasks ET
ON ETA.map_id = ET.id
WHERE ET.taskName = "The task I want to exclude"
This will return a list of all ids of the employees that have had this task. I want to exclude these from my SELECT statement from transaction_details by using a JOIN instead of a subquery. I have tried using a LEFT OUTER JOIN WHERE ETA.id IS NULL, but this does not work. How can I use a JOIN to exclude certain employees in this case?
Upvotes: 0
Views: 30
Reputation: 48810
You seem to think that an outer join is more performant than a subquery, but that's not true. It all depends on the SQL planner, SQL optimizer, existing indexes, table stats, and definitively on the repertoire of data operators your database engine offers.
Also, you should consider that after parsing, the query enters the transformation phase where the database engine is free to rewrite your query in a more efficient way. This means that behind the scenes your query may actually be executed using outer joins, even if you write subqueries. After that the [rewritten] query enters the query planner, and later the SQL optimizer.
The only way of optimizing it is to get the execution plan of all query options and compare them.
Upvotes: 0
Reputation: 48139
You appear to be close on your initial query, but why not have the NOT IN query joined and get distinct employees... Something like
SELECT
TD.someInfo
FROM
transaction_details TD
WHERE
TD.employee_id NOT IN
(SELECT DISTINCT
employee_id
from
employee_task_associations ETA
JOIN employee_tasks ET
ON ETA.map_id = ET.ID
AND ET.taskName = "The task I want to exclude")
Upvotes: 1