Reputation: 14350
I'm looking at Bouncy Castle to see how the performance of its hash algorithms compares to that of the .NET Framework, and it doesn't look great: the MD5 implementation is around 6x slower than .NET's, and the SHA256 implementation is around 3x slower.
So I want to make sure I'm using Bouncy Castle correctly, since the documentation is virtually non-existent. Here's what I'm doing:
using System.IO;
using Org.BouncyCastle.Crypto;
using Org.BouncyCastle.Crypto.Digests;

public byte[] Hash(string filename)
{
    IDigest hash = new Sha256Digest();
    byte[] result = new byte[hash.GetDigestSize()];
    using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read,
                                   FileShare.Delete | FileShare.ReadWrite))
    {
        byte[] buffer = new byte[4092];
        int bytesRead;
        // Feed the file to the digest one buffer at a time.
        while ((bytesRead = fs.Read(buffer, 0, buffer.Length)) > 0)
        {
            hash.BlockUpdate(buffer, 0, bytesRead);
        }
        // Write the final digest into result.
        hash.DoFinal(result, 0);
    }
    return result;
}
EDIT
For comparison, here's how I'm doing it with .NET:
using System.IO;
using System.Security.Cryptography;

public byte[] Hash(string filename)
{
    byte[] hashBytes;
    HashAlgorithm hash = new SHA256CryptoServiceProvider();
    using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read,
                                   FileShare.Delete | FileShare.ReadWrite))
    {
        try
        {
            // ComputeHash reads the stream to the end internally.
            hashBytes = hash.ComputeHash(fs);
        }
        finally
        {
            hash.Clear();
        }
    }
    return hashBytes;
}
Upvotes: 2
Views: 8495
Reputation: 323
Seems like you're using it correctly.
You're also right about the performance difference between the two implementations. In my latest tests, MD5 hashing with the Bouncy Castle C# NuGet package was roughly 2x slower than the .NET implementation.
Upvotes: 1
Reputation: 341
I'm aware this question is quite old, but currently I'm able to obtain the same speed for both .NET and Bouncy Castle's MD5 algorithm implementation.
However, instead of computing the hash while reading the file, I'm reading the full file content in a previous step and then hashing it:
// buffer is a byte[] holding the full file content, read in a previous step.
var md5Digest = new MD5Digest();
var hash = new byte[md5Digest.GetDigestSize()];
md5Digest.BlockUpdate(buffer, 0, buffer.Length);
md5Digest.DoFinal(hash, 0);
// Once used, drop the reference so the buffer becomes eligible for garbage collection.
buffer = null;
(I'm well aware that holding the full file content in memory may not always be practical.)
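Reading the file first and hashing it afterwards also makes it easy to time the two phases separately, which gives a rough idea of whether the time goes into I/O or into the hash itself. A minimal sketch, assuming .NET's built-in `MD5` for the hashing phase; `PhaseTimer` and `Measure` are hypothetical names, and a Bouncy Castle digest could be timed the same way:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

public static class PhaseTimer
{
    // Hypothetical helper: returns (ms spent reading, ms spent hashing).
    public static Tuple<long, long> Measure(string filename)
    {
        // Phase 1: pull the whole file into memory.
        var readTimer = Stopwatch.StartNew();
        byte[] content = File.ReadAllBytes(filename);
        readTimer.Stop();

        // Phase 2: hash the in-memory bytes, with no file I/O involved.
        var hashTimer = Stopwatch.StartNew();
        using (var md5 = MD5.Create())
        {
            md5.ComputeHash(content);
        }
        hashTimer.Stop();

        return Tuple.Create(readTimer.ElapsedMilliseconds,
                            hashTimer.ElapsedMilliseconds);
    }
}
```

Keep in mind that the operating system caches recently read files, so the read phase is usually much faster on a second run over the same file.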
Upvotes: 0
Reputation: 3872
While it is possible that the two algorithms are so vastly different that you experience a differential of 3-6x, it is also likely that the issue is a result of an I/O difference. By passing in a FileStream to the .NET implementation, it is possible that it is doing some clever things internally to achieve better I/O performance (such as hashing and reading concurrently) that you are not doing in your Bouncy Castle example.
To test this you can either:
1. Make your two examples as similar as possible (this is what I would do). You can use TransformBlock and TransformFinalBlock on the .NET HashAlgorithm, which mirrors your Bouncy Castle loop much more closely.
2. Try to make I/O optimizations to your Bouncy Castle code and see if you can approach the performance of the .NET implementation.
This may be moot though. If the .NET implementation meets your needs, it may be the best fit for your application. It seems that it may already have some performance characteristics built in that you would have to add manually to the Bouncy Castle implementation.
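The first option above could look roughly like the sketch below, which drives the .NET hash with the same read loop as the Bouncy Castle version. It assumes `SHA256.Create()` in place of the question's `SHA256CryptoServiceProvider`; the `ChunkedHash` class and the `bufferSize` parameter are hypothetical names, and the buffer size should be set to whatever the Bouncy Castle loop uses:

```csharp
using System.IO;
using System.Security.Cryptography;

public static class ChunkedHash
{
    // Hashes a file chunk by chunk, mirroring a BlockUpdate/DoFinal loop.
    public static byte[] HashFile(string filename, int bufferSize = 4096)
    {
        using (HashAlgorithm hash = SHA256.Create())
        using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read,
                                       FileShare.Delete | FileShare.ReadWrite))
        {
            byte[] buffer = new byte[bufferSize];
            int bytesRead;
            while ((bytesRead = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                // A null output buffer means the input is hashed but not copied.
                hash.TransformBlock(buffer, 0, bytesRead, null, 0);
            }
            // An empty final block finishes the computation.
            hash.TransformFinalBlock(buffer, 0, 0);
            return hash.Hash;
        }
    }
}
```

With both versions reading the file the same way, any remaining gap is more likely down to the hash implementations themselves than to how the stream is consumed.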
Upvotes: 1
Reputation: 1502935
The first thing you should check is whether you're IO-bound or CPU-bound. If you're CPU-bound, then I suspect the difference really is down to Bouncy Castle. If you're IO-bound, it could be that .NET is being smarter about the IO. To start with, you might want to increase your buffer size from 4K to (say) 64K. Just give it a try; it's a really easy change. A harder change would be to use async IO so that you're reading the next buffer's worth of unhashed data while you're hashing the data you've already got.
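Both suggestions can be combined in a small helper. This is only a sketch under assumptions: `OverlappedHashing` and `FeedAsync` are hypothetical names, and the hash is passed in as a delegate so the same loop works with either library. It reads in 64K chunks and starts the next read before hashing the chunk that has just arrived:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class OverlappedHashing
{
    // Feeds the stream to `update` chunk by chunk, overlapping each read
    // with the hashing of the previous chunk (two buffers, swapped per pass).
    public static async Task FeedAsync(Stream input,
                                       Action<byte[], int, int> update,
                                       int bufferSize = 64 * 1024)
    {
        byte[] current = new byte[bufferSize];
        byte[] next = new byte[bufferSize];

        int bytesRead = await input.ReadAsync(current, 0, current.Length);
        while (bytesRead > 0)
        {
            // Start reading the next chunk into the spare buffer...
            Task<int> pending = input.ReadAsync(next, 0, next.Length);

            // ...while the chunk we already have is being hashed.
            update(current, 0, bytesRead);

            bytesRead = await pending;

            // The buffer that was just filled becomes the one to hash next.
            byte[] temp = current;
            current = next;
            next = temp;
        }
    }
}
```

With Bouncy Castle, the call might look like `await OverlappedHashing.FeedAsync(fs, hash.BlockUpdate);` followed by `hash.DoFinal(result, 0);`, where `fs` is a `FileStream` opened with `useAsync: true`. Whether this actually helps depends on the disk and the file size, so it is worth measuring before and after.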
Upvotes: 3