Reputation: 353
I have an existing table (VendorMaster) with 18 columns and around 200,000 rows in it. Now I have built a new table (NewVendorMaster) with different logic to get the same data under the same schema.
Schema of VendorMaster and NewVendorMaster:

Columns: V1 V2 V3 V4 V5 V6 V7 V8 ..... V16
Sample values: data1 data2 data3 data4 data5 dataN type3 type16 data3 data4 type3 type17 data14 data18 type20 data4 type17 data14 type45 type20
Now I want to compare those two tables. As there are 18 columns and 200,000 records, should I go for a column-wise comparison, or concatenate all data into one column and then compare for a faster result?
A join on these two tables seems to take hours to complete.
Also, would creating an index on both tables help make the comparison faster?
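For reference, a minimal sketch of the two approaches being weighed here, assuming SQL Server and using the V1..V16 names from the schema above as placeholder columns (NULL handling and the remaining columns are elided):

-- Column-wise comparison: rows of VendorMaster that have no identical row
-- in NewVendorMaster, comparing every column directly.
SELECT vm.*
FROM VendorMaster AS vm
WHERE NOT EXISTS (
    SELECT 1
    FROM NewVendorMaster AS nvm
    WHERE nvm.V1 = vm.V1
      AND nvm.V2 = vm.V2
      AND nvm.V3 = vm.V3
      -- ... repeat for the remaining columns up to V16
);

-- Concatenated comparison: build one string per row and compare those.
-- This only works if the separator never appears in the data itself.
SELECT vm.*
FROM VendorMaster AS vm
WHERE NOT EXISTS (
    SELECT 1
    FROM NewVendorMaster AS nvm
    WHERE CONCAT(nvm.V1, '|', nvm.V2, '|', nvm.V3) =
          CONCAT(vm.V1, '|', vm.V2, '|', vm.V3)
      -- ... extend the CONCAT() lists with the remaining columns up to V16
);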
Upvotes: 0
Views: 1682
Reputation: 6002
I would advise against concatenating the fields before comparing and instead suggest creating a checksum value and working from there. I've done something similar and approached it like this: first I added an extra field to the table that holds the BINARY_CHECKSUM() of the key fields (you can use a computed column for this!) and put an index on just that crc field but INCLUDE() all the actual key fields. The generated values are not going to be unique, but they will be close enough to work. Mind that it might take a while for the index to be created, and depending on the size it could take up quite a bit of space too.
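A minimal sketch of that setup, assuming SQL Server and the placeholder V1..V16 column names from the question (substitute the real key fields and extend the lists):

-- Add a computed checksum column over the key fields on both tables.
ALTER TABLE VendorMaster
    ADD crc AS BINARY_CHECKSUM(V1, V2, V3 /* , ... up to V16 */) PERSISTED;

ALTER TABLE NewVendorMaster
    ADD crc AS BINARY_CHECKSUM(V1, V2, V3 /* , ... up to V16 */) PERSISTED;

-- Index just the checksum, but INCLUDE the key fields so the comparison
-- can be answered from the index without touching the base table.
CREATE INDEX IX_VendorMaster_crc
    ON VendorMaster (crc)
    INCLUDE (V1, V2, V3 /* , ... up to V16 */);

CREATE INDEX IX_NewVendorMaster_crc
    ON NewVendorMaster (crc)
    INCLUDE (V1, V2, V3 /* , ... up to V16 */);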
When JOINing the two tables, join on the crc plus all the key fields (**). SQL will be quite good at matching just the integer to find the right row(s), and comparing the other fields can then be done from the included columns.
(**: don't rely on just the crc value; BINARY_CHECKSUM() is fast and easy to use, but it will probably have collisions!)
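Put together, the comparison join might look roughly like this (again with the placeholder V1..V16 column names):

-- Match on the cheap integer checksum first, then verify the actual
-- key fields to rule out BINARY_CHECKSUM() collisions.
SELECT vm.*
FROM VendorMaster AS vm
JOIN NewVendorMaster AS nvm
    ON  nvm.crc = vm.crc
    AND nvm.V1  = vm.V1
    AND nvm.V2  = vm.V2
    AND nvm.V3  = vm.V3
    -- ... repeat for the remaining key fields up to V16
;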
Upvotes: 2
Reputation: 116
This is not a complete answer, but it may help save some time
I was doing a very similar thing - normalising a 2.5-million-row one-dimensional database into half a dozen tables.
What I ended up doing was using C to write short programs to load the data in, step through it, and then insert new records into the new database tables, because the clever SQL took all night and didn't actually generate a result.
Using C I could write debug output to see where things were happening, and better still, when things went wrong I could see what was going wrong, and best of all, when the job was half done, it was half done - aborting the program halfway through at least left some good new data in the new table.
So my main point is this: don't try to be clever and construct one magic SQL query that does all the work. Break the job into steps and program it. It runs massively faster.
And you know far better how the data is arranged than SQL ever will.
(Adding indexes is always a good idea on tables that don't change much, and it really speeds up searches. You can always remove them later on.)
Upvotes: 0