Reputation: 79549
I've two tables in SQLite:
Table1:
-------
id
name
Table2:
-------
id
temp_name
My question is, how do I write an SQL query that returns names in Table2
that are not in Table1
?
For example:
Table1:
-------
1, 'john'
2, 'boda',
3, 'cydo',
4, 'linus'
Table2:
-------
1123, 'boda'
2992, 'andy',
9331, 'sille',
2, 'cydo'
In this example the SQL query should return elements andy
, and sille
from Table2
, because they're not in Table1
.
Upvotes: 7
Views: 6816
Reputation: 308
I was curious about which of the options would work best for my use case and thought it might be of help to others if I shared my results:
In short, select where not in is the speediest (for me!). It's also worth noting that using all options returned duplicates except for except.
Option 0: SELECT {c2} FROM {t2} WHERE {c2} not in (SELECT {c1} FROM {t1});
Entries returned: 1098, Unique entries returned: 357
Average: 1.8680 seconds
Entries returned: 0, Unique entries returned: 0
Average: 0.6664 seconds
Option 1: SELECT {c2} FROM {t2} EXCEPT SELECT {c1} FROM {t1};
Entries returned: 357, Unique entries returned: 357
Average: 3.9455 seconds
Entries returned: 0, Unique entries returned: 0
Average: 3.3074 seconds
Option 2: SELECT {t2}.{c2} FROM {t2} LEFT OUTER JOIN {t1} ON {t1}.{c1} = {t2}.{c2} WHERE {t1}.{c1} IS null;
Entries returned: 1098, Unique entries returned: 357
Average: 2.3330 seconds
Entries returned: 0, Unique entries returned: 0
Average: 1.1982 seconds
Option 3: SELECT {c2} FROM {t2} WHERE NOT EXISTS (SELECT 1 FROM {t1} WHERE {c1} = {t2}.{c2});
Entries returned: 1098, Unique entries returned: 357
Average: 2.6945 seconds
Entries returned: 0, Unique entries returned: 0
Average: 0.9737 seconds
Here's the code I used to run the numbers:
import sqlite3
import timeit
# Database path here
database = "database.db"
# Your table and column names here
t1, c1 = 'Table_1', 'name'
t2, c2 = 'Table_2', 'temp_name'
# Reverse the test
dbs = [{'t1': t1, 'c1':c1, 't2': t2, 'c2': c2},
{'t1': t2, 'c1':c2, 't2': t1, 'c2': c1}]
commands = ["SELECT {c2} FROM {t2} WHERE {c2} not in (SELECT {c1} FROM {t1});",
"SELECT {c2} FROM {t2} EXCEPT SELECT {c1} FROM {t1};",
"SELECT {c2} FROM {t2} LEFT OUTER JOIN {t1} ON {t1}.{c1} = {t2}.{c2} WHERE {t2}.{c2} IS null;",
"SELECT {c2} FROM {t2} WHERE NOT EXISTS (SELECT 1 FROM {t1} WHERE {c1} = {t2}.{c2});",]
for i, c in enumerate(commands):
print("Option {}: {}".format(i, c))
for db in dbs:
co = c.format(**db)
foo = sqlite3.connect(database).execute(co).fetchall()
# Sanity check that entries have been found and how many
print("\tEntries returned: {}, Unique entries returned: {}".format(len(foo), len({a[0] for a in foo})))
# Reconnect to the database each time - I can't remember if there's any caching
t = timeit.repeat(lambda: sqlite3.connect(database).execute(co).fetchall(), repeat=5, number=1)
print('\tAverage: {:.4f} seconds'.format(statistics.mean(t)))
Upvotes: 5
Reputation: 7881
select name
from table2
except
select temp_name
from table1
Upvotes: 3
Reputation: 1271231
This is how to do it in "obvious" standard SQL:
select *
from table2
where temp_name not in (select name from table1)
There are other methods, such as using left outer join
, exists
in the where
clause, and the except
operation.
Upvotes: 15
Reputation: 180310
The EXISTS
method mentioned by Gordon:
SELECT *
FROM Table2
WHERE NOT EXISTS (SELECT 1
FROM Table1
WHERE Table1.name = Table2.temp_name)
Upvotes: 1
Reputation:
The left join version:
select t2.* from Table2 t2
left outer join Table1 t1 on t1.name = t2.temp_name
where t2.temp_name is null
Upvotes: 2