Reputation: 622
Searched around a bit, but I didn't really find what I was looking for.
I have to validate about 100 byte[16384] arrays every second (among many other tasks..). The biggest problem lurking around the corner is speed.
Do you guys know any good checksum algorithm within C#.NET that is insanely fast? It does not have to be very exact, but if a single bit changes, the checksum should (usually..) change as well.
The bytes are already in memory, so there's no I/O to slow things down.
Thanks!
Upvotes: 2
Views: 5898
Reputation: 16513
Expanding on C.Evenhuis's answer, here are some variations that should be quite a bit faster. I'm not sure of their correctness though; anybody with more bit-fiddling experience want to help me out? They don't give the same checksum as the per-byte one, but I do think they give a checksum that's just as good (not very good, but apparently sufficient).
As I said in the comment, you can improve the speed a lot by not processing the array byte by byte, but by treating it as a 4-times-smaller array of ints, or an 8-times-smaller array of longs. Treating it as a long[] only provides a performance benefit on 64-bit, though.
static unsafe uint ChecksumInt(byte[] array)
{
    unchecked
    {
        uint checksum = 0;
        fixed (byte* ptr = array)
        {
            var intPtr = (uint*)ptr;
            var iterations = array.Length / 4;
            var remainderIterations = array.Length % 4;
            for (var i = 0; i < iterations; i++)
            {
                checksum += intPtr[i];
            }
            // > 0, not >= 0: with >= 0 the last read would be
            // ptr[array.Length], one byte past the end of the array.
            while (remainderIterations > 0) // no more than 3 iterations
            {
                checksum += ptr[array.Length - remainderIterations];
                remainderIterations--;
            }
            return checksum;
        }
    }
}
static unsafe ulong ChecksumLong(byte[] array)
{
    unchecked
    {
        ulong checksum = 0;
        fixed (byte* ptr = array)
        {
            var longPtr = (ulong*)ptr;
            var iterations = array.Length / 8;
            var remainderIterations = array.Length % 8;
            for (var i = 0; i < iterations; i++)
            {
                checksum += longPtr[i];
            }
            // Same fix as above: > 0 avoids reading past the end.
            while (remainderIterations > 0) // no more than 7 iterations
            {
                checksum += ptr[array.Length - remainderIterations];
                remainderIterations--;
            }
            return checksum;
        }
    }
}
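If you'd rather avoid unsafe code, the same idea can be sketched with BitConverter (SafeChecksumInt is just a name I've made up for this example). On little-endian hardware it should produce the same value as ChecksumInt above, though the pointer version will likely be faster:

```csharp
using System;

static class ChecksumDemo
{
    // Safe sketch of the same idea: sum the array as 4-byte uints,
    // then add any trailing bytes individually. No unsafe/fixed needed.
    public static uint SafeChecksumInt(byte[] array)
    {
        unchecked
        {
            uint checksum = 0;
            int i = 0;
            // Whole 4-byte chunks; BitConverter matches the pointer
            // cast's result on little-endian machines.
            for (; i + 4 <= array.Length; i += 4)
                checksum += BitConverter.ToUInt32(array, i);
            // Remaining 0-3 bytes.
            for (; i < array.Length; i++)
                checksum += array[i];
            return checksum;
        }
    }
}
```

For a byte[16384] the length is divisible by 4 (and 8), so the remainder loop never runs at all in the asker's scenario.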
My performance measurements on 64-bit (Core 2 Duo 3 GHz) for an array of 100,000 items over 10,000 iterations showed both variants to be quite a bit faster than the per-byte version.
But, like I said, I don't know for sure if this provides an equally good checksum.
Upvotes: 5
Reputation: 26446
If every single bit matters, the checksum algorithm has to process each and every byte. A simple approach is to add each byte value, ignoring overflow:
static unsafe uint GetChecksum(byte[] array)
{
    unchecked
    {
        uint checksum = 0;
        fixed (byte* arrayBase = array)
        {
            byte* arrayPointer = arrayBase;
            for (int i = array.Length - 1; i >= 0; i--)
            {
                checksum += *arrayPointer;
                arrayPointer++;
            }
        }
        return checksum;
    }
}
Of course you won't detect all changes (different arrays can produce the same sum), but it gives you an idea of how a fast algorithm performs.
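One nice property of this additive checksum: flipping a single bit changes exactly one byte by a nonzero amount, so the 32-bit sum always changes, which matches the asker's "if a single bit changes, the checksum should change" requirement. A small managed sketch of the same algorithm (class and method names are mine, not from the answer):

```csharp
static class BitFlipDemo
{
    // Managed version of the per-byte additive checksum above.
    public static uint SumChecksum(byte[] array)
    {
        unchecked
        {
            uint checksum = 0;
            foreach (byte b in array)
                checksum += b;
            return checksum;
        }
    }
}
```

Two changes can still cancel each other out (e.g. +1 in one byte and -1 in another), which is exactly the kind of duplicate this answer warns about.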
Upvotes: 1