Reputation: 3
I need to check two tables and find inconsistencies, ie where the value of table T1 is not present in the italy_cities table. I'll explain:
T1: Includes personal data (with place of birth) italy_city: Includes all the municipalities of Italy.
Table T1 has about 9000 tuples. T2 has 7,903 tuples.
Using "NOT IN" the query takes approximately 16 seconds to execute.
Here is the query:
SELECT
`T1`.*
FROM
T1
WHERE
(
`T1`.place NOT IN ( SELECT municipality FROM italy_cities )
)
MY QUESTION
what is the best and fast option to check for inconsistencies? to check all the "incorrect" municipalities that do not exist in the official database?
Thanks in advance
Upvotes: 0
Views: 295
Reputation: 35900
Not exists
can perform better but there is also another way which is left join
as follows:
SELECT T1.*
FROM T1
LEFT JOIN italy_cities I ON I.municipality = T1.PLACE
WHERE I.municipality IS NULL;
Upvotes: 1
Reputation: 1269773
I generally recommend NOT EXISTS
for this purpose:
SELECT T1.*
FROM T1
WHERE NOT EXISTS (SELECT 1
FROM italy_cities ic
WHERE t1.place = ic.municipality
);
Why? There are two reasons:
NOT IN
does not do what you expect if the subquery returns any NULL
values. If even one value is NULL
all rows end up being filtered out.italy_cities(municipality)
which seems like a reasonable index on the table.Upvotes: 1