Reputation: 6809
I have a table that maintains a "Gold Standard" set of data that another table should match if the table was processed correctly.
Both of these tables have almost 1,000,000 records of data.
For example. I have table (table1) that have PrimaryKey1, ColumnA, ColumnB, ColumnC, ColumnD, and Column E.
I have another table (table2) with ForeignKey1, ColumnF, ColumnG, ColumnH, ColumnI, ColumnJ.
I need to check that all the data in these two table are exactly the same except for a few columns.
What I mean by that is that ColumnA from table1 has to have all of the same as columnF in table2, and ColumnC from table1 has to matchup with ColumnI from table2 FOR THE SAME RECORD (lets call this primaryKey1). The other columns in the table do not matter.
Also, if there is a mismatch between the datasets, I need to know where the mismatch is.
Upvotes: 1
Views: 3365
Reputation: 2120
You could use symetric difference for this
(select 'table1', col
from table1
UNION ALL
select 'table2', col
from table2)
EXCEPT
(select 'table1', col
from table1
INTERSECT
select 'table2', col
from table2)
This query returns only those rows that are only in one table and it says in which table it was found
Upvotes: 0
Reputation: 93
I think the following will helps you to get the unmatched records.
select * from table1 where not exists (select * from table2);
so instead of all columns you can check with the columns what you need from the two tables,but i think the column names should be same.
Thank you.
Upvotes: 0
Reputation: 3996
Something like
select
*
from
gold_copy a
join my_copy b on a.primary_key = b.primary_key
and
a.field1 <> b.field1
or a.field_a <> b.field_f
or a.field_c <> b.field_i
or a.field_x <> b.field_y
Upvotes: 0
Reputation: 980
A quick trick that I use is just comparing row counts. This will at least show you if you have a problem (it won't show you where the problem is).
A union query can join two queries together and display the combined result. Common rows are treated as 1 row. So, if the first query returns exactly 1 million rows, the UNION query (both queries combined) should return exactly 1 million rows. If it doesn't there is a problem.
select ColumnA 'Col1'
, ColumnC 'Col2'
from Table1
UNION
select ColumnF 'Col1'
, ColumnI 'Col2'
from TableB
Upvotes: 0
Reputation: 1317
I think your best bet is SUBSTRACT(). Select x, y, z from A substract select x,y,z from B. If it returns nothing, you're good to go.
Hope this helps!
Upvotes: 1