Reputation: 6216
I have a class representing a unique real-world object, its name is personally identifiable information (a vehicle license plate if you're interested) so as a basic first step I am hashing the name and using that instead. (I know - salt required etc. - this is just a foundation)
I have a test which instantiates the object with a fixed input (a test name - "A") and asserts that the Id (Base 64 string of the hash output) is as expected. However, it occasionally fails(!!)
I dug a bit deeper and here's a screenshot of a conditional breakpoint which breaks only when the hash output isn't the norm. The input is still as expected (the 'bytes' variable contains { 65 }, but the output is different from the normal sha384 output hash (normally it's "rRSq8lAgvvL9Tj617AxQJyzf1mB0sO0DfJoRJUMhqsBymYU3S+6qW4ClBNBIvhhk")
Lines 19-25 are split up a bit to facilitate debugging, but otherwise this class is as it should be.
Any clues as to how this is possible would be very welcome. Running Windows 11, using .NET 7 and the latest version of Visual Studio Enterprise.
Here's a pic of a conditional breakpoint being hit where hash output is not the norm:
Here's the code should anyone wish to try to reproduce it (note it's not consistent: it's just occasional)
using System.Security.Cryptography;
using System.Text;
namespace Domain.Models.Object
{
/// <summary>
/// One-way identifier
/// </summary>
public record ObjectIdentifier
{
private static SHA384 hash = SHA384.Create();
public ObjectIdentifier(string name)
{
if (name == null)
{
throw new ArgumentNullException(nameof(name));
}
var bytes = Encoding.UTF8.GetBytes(name);
Id = Convert.ToBase64String(hash.ComputeHash(bytes));
int x = 0;
x++;
}
public string Id { get; init; }
public override string ToString()
{
return Id;
}
}
}
And here's the test:
[Fact]
public void ObjectIdentifier_AsExpected()
{
// arrange
var obj = new ObjectIdentifier("A");
// assert
Assert.Equal("rRSq8lAgvvL9Tj617AxQJyzf1mB0sO0DfJoRJUMhqsBymYU3S+6qW4ClBNBIvhhk", obj.Id);
}
I also note that the new hash value is not consistently the same: here's another failure with a different output from the previous screenshot:
I also note that adding this to line 20 causes the inconsistency to stop happening altogether... Unfortunately, this is not a suitable fix :P
Debug.Assert(bytes.Length == 1 && bytes[0] == 65)
Update The invalid outputs appear to just be the two supplied above, I haven't observed any further variants.
Also, changing it to be a class (instead of a record) makes no difference.
I'm also observing this effect in a test console app which has only two identifiers supplied to it, but in fact more than two hashes are output:
Upvotes: 0
Views: 236
Reputation: 6216
So, as Matt kindly pointed out in the comments, this operation isn't thread-safe. The shared worker object (shared because it was static) was to blame. An easy one to miss - and a great example of why unit tests are a must!
I opted to use an object pool to get a unique worker per instantiation while allowing for some reduction in allocation overhead:
using Domain.Implementations;
using Microsoft.Extensions.ObjectPool;
using System.Text;
namespace Domain.Models.Object
{
/// <summary>
/// One-way identifier
/// </summary>
public class ObjectIdentifier
{
private static ObjectPool<HashWrapper> hashObjects = ObjectPool.Create<HashWrapper>();
public ObjectIdentifier(string name)
{
if (name == null)
{
throw new ArgumentNullException(nameof(name));
}
var hashworker = hashObjects.Get();
try
{
Id = Convert.ToBase64String(hashworker.Value.ComputeHash(Encoding.UTF8.GetBytes(name)));
}
finally
{
hashObjects.Return(hashworker);
}
}
public string Id { get; init; }
public override string ToString()
{
return Id;
}
}
}
The hash wrapper class (wraps the worker and provides a public constructor for the object pool)
internal class HashWrapper
{
public HashWrapper()
{
Value = SHA384.Create();
}
public HashAlgorithm Value { get; private set; }
}
Update
This implementation with a shared pool provided a (very minor) increase in speed and memory allocation when constructing higher numbers of objects. For smaller numbers, there is a (very minor) performance penalty. YMMV - see below.
This benchmark ran a version with Shared = true (normal lock on a shared object), false (new sha worker instance for each instantiation) and null [? in the column] : to signal using an object pool
Upvotes: 2