Reputation: 941
I encountered something very strange in Microsoft SQL Server and basically it's something to do with the CHECKSUM_AGG(BINARY_CHECKSUM(*))
function.
Let's say that I have 2 different tables where the content is like this:
As you can see, each table only has 2 possible row contents:
However, they're still different because the second table has 5 of these row combinations but if I try to calculate the CHECKSUM_AGG(BINARY_CHECKSUM(*))
of the 2 tables by running
SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) AS "Table 1 Checksum" FROM Table_1;
SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*))AS "Table 2 Checksum" FROM Table_2;
They will display the same result:
This is very weird and I don't know why this is happening. I'm doing the CHECKSUM_AGG
function to see if 2 tables have the same content and so far it looks like it's working quite well. However, in such rare cases where the 2 tables have similar content like those two above ^^, I'm afraid that the function will return the same result for the 2 tables.
Can someone please explain the reason behind this and if there's any way to mitigate this issue?
Thanks in advance and I would really appreciate any help :)
Upvotes: 3
Views: 1565
Reputation: 4095
BINARY_CHECKSUM
is checking if the values are the same but doesn't consider the number of times a value might occur on one table compared to the other. If the number of times the value occur is important and you wanted to detect there not the same because though they have the same values one table does have more entires of the values then the other, you can use HASHBYTES
to detect there difference.
BINARY_CHECKSUM satisfies the properties of a hash function: when applied >over any two lists of expressions, returns the same value if the corresponding >elements of the two lists have the same type and are equal when compared >using the equals (=) operator.
Example
SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) AS "Table 1 Checksum",
HashBytes('md5',convert(varbinary(max),(SELECT * FROM Table_1 FOR XML AUTO))) AS "Table 1 Hashbytes",
FROM Table_1;
SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*))AS "Table 2 Checksum",
HashBytes('md5',convert(varbinary(max),(SELECT * FROM Table_2 FOR XML AUTO))) AS "Table 2 Hashbytes",
FROM Table_2;
Upvotes: 1