niroice

Reputation: 163

Creating Guid using a hash of a string

I'm toying with the idea of using a Guid as a primary key in a NoSQL database, where the Guid is derived from a combination of three properties (it's probably a bad idea). The three properties are two integers and a DateTime; they are unique when combined. The reason I'm using a Guid is that preexisting data of the same structure uses a Guid, rather than these properties, to look up data.

If I convert them to strings and concatenate them, then convert the result to a byte[] and create a Guid, what are the chances of a collision? I assume the hashing will be the problem here. If I use a weak 16-byte hashing algorithm such as MD5, what are the chances of two Guids matching (a collision) when the properties (the integers and the DateTime) differ? What happens if I use a hashing algorithm like SHA256 and just use the first 16 bytes instead of MD5? Are the odds of a collision still the same?

Otherwise I have other options, such as a secondary lookup if required, but this doubles the writes, reads and cost.

Example:

    public static Guid GenerateId(int locationId, int orderNumber, DateTime orderDate)
    {
        var combined = $"{locationId}{orderNumber}{orderDate.ToString("d", CultureInfo.InvariantCulture)}";
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.Default.GetBytes(combined));
            return new Guid(hash);
        }
    }

Upvotes: 0

Views: 1283

Answers (1)

Gusman

Reputation: 15161

Why hash at all? If you are totally sure those three parameters combined are always unique, then you have all the data you need to create a unique GUID. A DateTime is 8 bytes long and an int is 4 bytes long, so your data is 8 + 4 + 4 = 16 bytes long, which is exactly the size of a GUID. You can use BitConverter to get the bytes of those values and use the Guid constructor that takes a 16-byte array:

    using System;
    using System.Collections.Generic;

    DateTime firstValue = DateTime.Now; // or whatever it is
    int secondValue = 33;               // whatever
    int thirdValue = 44;                // whatever

    List<byte> tempBuffer = new List<byte>();

    // DateTime has no BitConverter overload, so convert it to a long with ToBinary first (8 bytes)
    tempBuffer.AddRange(BitConverter.GetBytes(firstValue.ToBinary()));
    tempBuffer.AddRange(BitConverter.GetBytes(secondValue)); // 4 bytes
    tempBuffer.AddRange(BitConverter.GetBytes(thirdValue));  // 4 bytes

    // 8 + 4 + 4 = 16 bytes, exactly what the Guid(byte[]) constructor requires
    Guid id = new Guid(tempBuffer.ToArray());
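To make the "no collisions" argument concrete, here is a minimal sketch that wraps the same packing idea in a helper and adds the reverse operation. The names `OrderId`, `Pack` and `Unpack` are hypothetical, and it assumes all dates are stored with one consistent kind (UTC here), since it packs `Ticks` rather than `ToBinary()`:

```csharp
using System;

public static class OrderId
{
    // Packs the three values into 16 bytes: 8 for the timestamp, then 4 + 4 for the ints.
    // The byte order comes from BitConverter, so it follows the machine's endianness.
    public static Guid Pack(int locationId, int orderNumber, DateTime orderDate)
    {
        byte[] buffer = new byte[16];
        // Assumption: Ticks ignores DateTimeKind, so callers pass a consistent kind (UTC).
        BitConverter.GetBytes(orderDate.Ticks).CopyTo(buffer, 0);
        BitConverter.GetBytes(locationId).CopyTo(buffer, 8);
        BitConverter.GetBytes(orderNumber).CopyTo(buffer, 12);
        return new Guid(buffer);
    }

    // Reverses Pack by reading the same byte offsets back out of the Guid.
    public static (int LocationId, int OrderNumber, DateTime OrderDate) Unpack(Guid id)
    {
        byte[] buffer = id.ToByteArray();
        long ticks = BitConverter.ToInt64(buffer, 0);
        int locationId = BitConverter.ToInt32(buffer, 8);
        int orderNumber = BitConverter.ToInt32(buffer, 12);
        return (locationId, orderNumber, new DateTime(ticks, DateTimeKind.Utc));
    }
}
```

Because `Unpack` recovers the original inputs, two different inputs cannot produce the same Guid, which is something no hash can promise. Keep in mind that a Guid built this way carries no valid version or variant bits, and anyone holding the id can read the three values back out of it.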

Upvotes: 3
