\n
Here's the code should anyone wish to try to reproduce it (note it's not consistent: it's just occasional)
\nusing System.Security.Cryptography;\nusing System.Text;\n\nnamespace Domain.Models.Object\n{\n /// <summary>\n /// One-way identifier\n /// </summary>\n public record ObjectIdentifier\n {\n private static SHA384 hash = SHA384.Create();\n\n public ObjectIdentifier(string name)\n {\n if (name == null)\n {\n throw new ArgumentNullException(nameof(name));\n }\n var bytes = Encoding.UTF8.GetBytes(name);\n\n\n Id = Convert.ToBase64String(hash.ComputeHash(bytes));\n\n int x = 0;\n x++;\n }\n\n public string Id { get; init; }\n\n public override string ToString()\n {\n return Id;\n }\n }\n}\n
\nAnd here's the test:
\n[Fact]\npublic void ObjectIdentifier_AsExpected()\n{\n // arrange\n var obj = new ObjectIdentifier("A");\n\n // assert\n Assert.Equal("rRSq8lAgvvL9Tj617AxQJyzf1mB0sO0DfJoRJUMhqsBymYU3S+6qW4ClBNBIvhhk", obj.Id);\n}\n
\nI also note that the new hash value is not consistently the same: here's another failure with a different output from the previous screenshot:
\n\nI also note that adding this to line 20 causes the inconsistency to stop happening altogether...\nUnfortunately, this is not a suitable fix :P
\nDebug.Assert(bytes.Length == 1 && bytes[0] == 65)\n
\nUpdate\nThe invalid outputs appear to just be the two supplied above, I haven't observed any further variants.
\nAlso, changing it to be a class (instead of a record) makes no difference.
\nI'm also observing this effect in a test console app which has only two identifiers supplied to it, but in fact more than two hashes are output:
\n\n","author":{"@type":"Person","name":"Nathan"},"upvoteCount":0,"answerCount":1,"acceptedAnswer":{"@type":"Answer","text":"So, as Matt kindly pointed out in the comments, this operation isn't thread-safe. The shared worker object (shared because it was static) was to blame. An easy one to miss - and a great example of why unit tests are a must!
\nI opted to use an object pool to get a unique worker per instantiation while allowing for some reduction in allocation overhead:
\nusing Domain.Implementations;\nusing Microsoft.Extensions.ObjectPool;\nusing System.Text;\n\nnamespace Domain.Models.Object\n{\n /// <summary>\n /// One-way identifier\n /// </summary>\n public class ObjectIdentifier\n {\n private static ObjectPool<HashWrapper> hashObjects = ObjectPool.Create<HashWrapper>();\n\n public ObjectIdentifier(string name)\n {\n if (name == null)\n {\n throw new ArgumentNullException(nameof(name));\n }\n\n var hashworker = hashObjects.Get();\n try\n {\n Id = Convert.ToBase64String(hashworker.Value.ComputeHash(Encoding.UTF8.GetBytes(name)));\n }\n finally\n {\n hashObjects.Return(hashworker);\n } \n }\n\n public string Id { get; init; }\n\n public override string ToString()\n {\n return Id;\n }\n }\n}\n
\nThe hash wrapper class (wraps the worker and provides a public constructor for the object pool)
\ninternal class HashWrapper\n{\n public HashWrapper()\n {\n Value = SHA384.Create();\n }\n\n public HashAlgorithm Value { get; private set; }\n}\n
\nUpdate
\nThis implementation with a shared pool provided a (very minor) increase in speed and memory allocation when constructing higher numbers of objects. For smaller numbers, there is a (very minor) performance penalty. YMMV - see below.
\nThis benchmark ran a version with Shared = true (normal lock on a shared object), false (new sha worker instance for each instantiation) and null [? in the column] : to signal using an object pool
\n\n","author":{"@type":"Person","name":"Nathan"},"upvoteCount":2}}}Reputation: 6216
I have a class representing a unique real-world object, its name is personally identifiable information (a vehicle license plate if you're interested) so as a basic first step I am hashing the name and using that instead. (I know - salt required etc. - this is just a foundation)
I have a test which instantiates the object with a fixed input (a test name - "A") and asserts that the Id (Base 64 string of the hash output) is as expected. However, it occasionally fails(!!)
I dug a bit deeper and here's a screenshot of a conditional breakpoint which breaks only when the hash output isn't the norm. The input is still as expected (the 'bytes' variable contains { 65 }, but the output is different from the normal sha384 output hash (normally it's "rRSq8lAgvvL9Tj617AxQJyzf1mB0sO0DfJoRJUMhqsBymYU3S+6qW4ClBNBIvhhk")
Lines 19-25 are split up a bit to facilitate debugging, but otherwise this class is as it should be.
Any clues as to how this is possible would be very welcome. Running Windows 11, using .NET 7 and the latest version of Visual Studio Enterprise.
Here's a pic of a conditional breakpoint being hit where hash output is not the norm:
Here's the code should anyone wish to try to reproduce it (note it's not consistent: it's just occasional)
using System.Security.Cryptography;
using System.Text;
namespace Domain.Models.Object
{
/// <summary>
/// One-way identifier
/// </summary>
public record ObjectIdentifier
{
private static SHA384 hash = SHA384.Create();
public ObjectIdentifier(string name)
{
if (name == null)
{
throw new ArgumentNullException(nameof(name));
}
var bytes = Encoding.UTF8.GetBytes(name);
Id = Convert.ToBase64String(hash.ComputeHash(bytes));
int x = 0;
x++;
}
public string Id { get; init; }
public override string ToString()
{
return Id;
}
}
}
And here's the test:
[Fact]
public void ObjectIdentifier_AsExpected()
{
// arrange
var obj = new ObjectIdentifier("A");
// assert
Assert.Equal("rRSq8lAgvvL9Tj617AxQJyzf1mB0sO0DfJoRJUMhqsBymYU3S+6qW4ClBNBIvhhk", obj.Id);
}
I also note that the new hash value is not consistently the same: here's another failure with a different output from the previous screenshot:
I also note that adding this to line 20 causes the inconsistency to stop happening altogether... Unfortunately, this is not a suitable fix :P
Debug.Assert(bytes.Length == 1 && bytes[0] == 65)
Update The invalid outputs appear to just be the two supplied above, I haven't observed any further variants.
Also, changing it to be a class (instead of a record) makes no difference.
I'm also observing this effect in a test console app which has only two identifiers supplied to it, but in fact more than two hashes are output:
Upvotes: 0
Views: 237
Reputation: 6216
So, as Matt kindly pointed out in the comments, this operation isn't thread-safe. The shared worker object (shared because it was static) was to blame. An easy one to miss - and a great example of why unit tests are a must!
I opted to use an object pool to get a unique worker per instantiation while allowing for some reduction in allocation overhead:
using Domain.Implementations;
using Microsoft.Extensions.ObjectPool;
using System.Text;
namespace Domain.Models.Object
{
/// <summary>
/// One-way identifier
/// </summary>
public class ObjectIdentifier
{
private static ObjectPool<HashWrapper> hashObjects = ObjectPool.Create<HashWrapper>();
public ObjectIdentifier(string name)
{
if (name == null)
{
throw new ArgumentNullException(nameof(name));
}
var hashworker = hashObjects.Get();
try
{
Id = Convert.ToBase64String(hashworker.Value.ComputeHash(Encoding.UTF8.GetBytes(name)));
}
finally
{
hashObjects.Return(hashworker);
}
}
public string Id { get; init; }
public override string ToString()
{
return Id;
}
}
}
The hash wrapper class (wraps the worker and provides a public constructor for the object pool)
internal class HashWrapper
{
public HashWrapper()
{
Value = SHA384.Create();
}
public HashAlgorithm Value { get; private set; }
}
Update
This implementation with a shared pool provided a (very minor) increase in speed and memory allocation when constructing higher numbers of objects. For smaller numbers, there is a (very minor) performance penalty. YMMV - see below.
This benchmark ran a version with Shared = true (normal lock on a shared object), false (new sha worker instance for each instantiation) and null [? in the column] : to signal using an object pool
Upvotes: 2