Reputation: 537
I'm looking for a hash/checksum algorithm that is as fast as possible while still being able to detect changes to a 4096 byte memory page. Because the size is fixed I hope it should be possible to attain an optimized algorithm exactly for this purpose, since the size is guaranteed to not change.
What I'm doing is checksumming some memory pages, doing an operation, then checksumming the pages again to see if they've changed. Due to space reasons it is not feasible to simply compare bytewise with a copy of the old bytes. I don't need know where within the page the change occurred, just if a change happened, so comparing the checksums is enough.
I've tried CRC32 in hardware and xxHash, both with good results. Still, as far as I know they are not tailored for a fixed size buffer.
Edit: Here is the code I'm using for CRC32 in hardware. For some reason it's slower than xxHash.
// Warning! Not padding, so don't use if length isn't dividable by sizeof(uint32_t).
uint32_t sse42_crc32_32bit(const uint32_t* buffer, const uint32_t length)
{
uint32_t crc = 0;
const uint32_t numRounds = length / sizeof(uint32_t);
for (uint32_t i = 0; i < numRounds; ++i)
{
crc = _mm_crc32_u32(crc, buffer[i]);
}
return crc;
}
Upvotes: 2
Views: 1630
Reputation: 29952
There is farmHash128, which is faster than xxHash64. I'm not sure about its quality, though.
If you use xxHash64, I think that you can unroll it a little bit (for example, 8 times), and it will be a little bit faster. And if the page is not in cache, prefetching may help.
Note, however, this
"If I miss one bit of change it's game over."
is a risky game to play. xxHash64's 64-bit of output is surely inadequate for this. You will surely have hash collisions. I'd say that you'll need to use a 256-bit hash at least for this purpose. It won't detect changes 100%, but close. The usual wisdom is to use hash size which has a lower collision probability than the probability of system failure (multiplied by 10^-X, where X is a "smallish" positive number, like 5).
Upvotes: 2