Reputation: 84
I need to make a comparison between two (or more) tables with around 60.000 rows and about 60 columns.
In these tables there are two values on which I want to run a query. The purpose of the query is to count the rows which exists in TABLE_A but don't exist in TABLE_B based on two values in the row.
I've ran the following query:
SELECT id
FROM table_a ta
WHERE NOT EXISTS (
SELECT id
FROM table_b tb
WHERE ta.value1=tb.value1 AND ta.value2=tb.value2
)
As said, I've tried the code above and some variations on it. But to run this query it takes ages before it's finished. I hope to find a solution which runs in under 10 seconds.
Next query I tried, and of which I thought was working:
SELECT value1, value2
FROM (
SELECT ta.value1, ta.value2
FROM table_a ta
UNION ALL
SELECT tb.value1, tb.value2
FROM table_b tb
) result
GROUP BY value1, value2
HAVING COUNT(*) = 1
ORDER BY value1
The code shows me all differences between the two tables. So if valueX exists in TABLE_A but not in TABLE_B it's shown and vice versa.
So in short, I want to get all rows from TABLE_A which are not present in TABLE_B based on two values in the row.
Hope someone can help, thanks!
Upvotes: 1
Views: 181
Reputation: 84
After some trial and error I have improved the second block of code. I noticed an additional field in my table which I could use to further filter the results.
SELECT date, value1, value2
FROM (
SELECT date, value1, value2
FROM (
SELECT ta.date, ta.value1, ta.value2
FROM table_1 ta
UNION ALL
SELECT tb.date, tb.value1, tb.value2
FROM table_2 tb
) filter
GROUP BY value1, value2
HAVING COUNT(*) = 1
) result
WHERE date='YYYY-MM-DD'
This code filters the results in under 4 seconds.
Anyway, thanks everyone for the trouble.
Upvotes: 0
Reputation: 28732
Why not use a join?
/* Create a table called NAMES */
CREATE TABLE NAMES(Id integer PRIMARY KEY, Name text, LastName text);
CREATE TABLE OTHERNAMES(Id integer PRIMARY KEY, Name text, LastName text);
/* Create few records in this table */
INSERT INTO NAMES VALUES(1,'Tom','Riddle');
INSERT INTO NAMES VALUES(2,'Lucy','I love');
INSERT INTO NAMES VALUES(3,'Frank','Frankly');
INSERT INTO NAMES VALUES(4,'Jane','Austen');
INSERT INTO NAMES VALUES(5,'Robert','Downey');
INSERT INTO OTHERNAMES VALUES(2,'Lucy','I love');
INSERT INTO OTHERNAMES VALUES(3,'Frank','Frankly');
INSERT INTO OTHERNAMES VALUES(4,'Jane','Austen');
INSERT INTO OTHERNAMES VALUES(5,'Robert','Downey');
select * from NAMES
LEFT JOIN OTHERNAMES on
NAMES.Name = OTHERNAMES.Name
AND Names.LastName = OTHERNAMES.LastName
where OTHERNAMES.id is null
See it online http://sqlfiddle.com/#!9/640c53/1
If you use a LEFT JOIN
Items that don't exist in the right table will be replaced with null entries, which can be filtered with a where.
I don't know how efficient that is with your 60.000 database but this usually does the trick for me.
Upvotes: 3