Reputation: 45
Can anybody explain why this statement returns the same output?
SELECT CHECKSUM('2019-03-14 00:15:32')
SELECT CHECKSUM('2019-03-13 00:15:42')
output:
843611027
Upvotes: 3
Views: 1078
Reputation: 8111
From the documentation:
If at least one of the values in the expression list changes, the list checksum will probably change. However, this is not guaranteed. Therefore, to detect whether values have changed, we recommend use of CHECKSUM only if your application can tolerate an occasional missed change. Otherwise, consider using HASHBYTES instead. With a specified MD5 hash algorithm, the probability that HASHBYTES will return the same result, for two different inputs, is much lower compared to CHECKSUM.
You've hit a collision. Have a look at Hashbytes
.
Upvotes: 3
Reputation: 17720
I'm not going to answer this specifically, but more generally in terms of the utility of a checksum process.
There are a few different checksum algorithms, of varying quality. The most basic checksums (like the one you're showing) are very fast to run, but have a high collision rate (the likelihood that two different strings will have the same output).
If you need to be very confident of a low collision rate, use a "cryptographically secure" checksum algorithm, which has been mathematically proven to have a very low collision rate. A good example is SHA256.
Basic checksums are good for error detection (i.e. did the machine read a credit card number or barcode correctly) but not for tampering detection (i.e. did someone mess with this important message) or identification (is this the right password).
Upvotes: 3