Reputation: 3602
In SQL server, we can easily generate a hash on an entire row using "checksum":
select
checksum(*), *
from mytable
Is there a way to do the same thing using the hashbytes function? Something like:
select
hashbytes('md5', *), *
from mytable
Upvotes: 2
Views: 2948
Reputation: 28789
If writing out the columns is no objection, you can simply concatenate the columns and hash that:
SELECT HASHBYTES('md5',
ISNULL(CONVERT(VARBINARY(MAX), column1), 0x) +
ISNULL(CONVERT(VARBINARY(MAX), column2), 0x) +
...
)
From SQL Server 2012 onwards, we can achieve the same more compactly using CONCAT
(thanks to @MWillemse for the tip):
SELECT HASHBYTES('md5', CONCAT(column1, column2, ...))
Be aware that this can cause collisions if your columns are "the same": hashing AA,BB,CC
yields the same result as AAB,BBC,C
or even AABBCC,NULL,NULL
. If this is a concern, you'll have to design your own hashing scheme to minimize the odds of this (by including the column names in the hash, for example). Also, this is a binary checksum: hashing A,B,C
is not the same as a,b,c
. Again, if this is a concern you'll have to tweak the data before hashing it (UPPER
).
Finally, MD5 is no longer considered cryptographically secure, so only use this for things like checksum indexes where you need less collision probability than with CHECKSUM
. If you need a cryptographic signature of a row, this is too simplistic.
Upvotes: 1
Reputation: 980
Not easily I'm afraid. And in most cases you shouldn't really want to. But if you must, generating a XML document for each row and hashing the XML document is probably the easiest way to go:
SELECT HASHBYTES('MD5', X.value)
FROM Data
CROSS APPLY (
SELECT value =
CAST((SELECT Data.* FROM (SELECT 1) AS X (y) FOR XML AUTO) AS VARCHAR(MAX))
) AS X
EDIT: I've did some testing, applying the CAST(... AS VARCHAR(MAX))
as I've done in the updated statement above improves performance about 15%. More importantly, it prevents truncation errors on wide tables. Concatenating the columns yourself is even after this fix still about 15% faster. Using CONCAT() (available from MSSQL 2012) function makes your live a little easier though as you can omit applying ISNULL and CAST functions to each column.
Upvotes: 2