Keelah

Reputation: 302

Generate unique ID from string in C#

I need my app to handle a list of mods from a database and a list of locally downloaded mods that are not in the database. Each database mod has a unique uint ID that I use to identify it, but local mods don't have any ID.

At first I tried to generate an ID with string.GetHashCode() from the mod's name, but GetHashCode is randomized on each run of the app. Is there another way to generate a persistent uint ID from the mod's name?

Current code:

foreach(string mod in localMods)
{
    //This way I get a number between 0 and 2147483647
    //(note: Math.Abs throws an OverflowException for int.MinValue)
    uint newId = Convert.ToUInt32(Math.Abs(mod.GetHashCode()));
    ProfileMod newMod = new ProfileMod(newId);
}

Upvotes: 0

Views: 8461

Answers (3)

Just Shadow

Reputation: 11871

GetHashCode() is not guaranteed to return the same value for the same string, especially across runs of the application. It has a different purpose (hash-based lookups and equality checks within a single run, etc.).
So, it shouldn't be used as a persistent unique identifier.

If you'd like to calculate the hash and get consistent results, you might consider using the standard hashing algorithms like MD5, SHA256, etc. Here is a sample that calculates SHA256:

using System;
using System.Security.Cryptography;
using System.Text;

public class Program
{
    public static void Main()
    {
        string input = "Hello World!";
        // Using the SHA256 algorithm for the hash.
        // NOTE: You can replace it with any other algorithm (e.g. MD5) if you need.
        using (var hashAlgorithm = SHA256.Create())
        {
            // Convert the input string to a byte array and compute the hash.
            byte[] data = hashAlgorithm.ComputeHash(Encoding.UTF8.GetBytes(input));

            // Create a new StringBuilder to collect the bytes
            // and create a string.
            var sBuilder = new StringBuilder();

            // Loop through each byte of the hashed data
            // and format each one as a hexadecimal string.
            for (int i = 0; i < data.Length; i++)
            {
                sBuilder.Append(data[i].ToString("x2"));
            }

            // Build the hexadecimal string.
            var hash = sBuilder.ToString();

            Console.WriteLine($"The SHA256 hash of {input} is: {hash}.");
        }
    }
}

Though SHA256 produces a longer result than MD5, the risk of collisions is much lower. If you still want smaller hashes (with a higher risk of collisions), you can use MD5 or even CRC32.

P.S. The sample code is based on the one from Microsoft's documentation.
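Since the question asks for a uint rather than a hex string, one option is to keep only part of the digest. The following is a minimal sketch, not a definitive implementation: the class and method names (StableId, FromName) are made up for illustration, and taking the first four bytes in big-endian order is an arbitrary choice. Truncating to 32 bits makes collisions far more likely than with the full hash.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class StableId
{
    // Hypothetical helper: derives a 32-bit ID from the first four bytes
    // of the SHA-256 digest of the name. The result depends only on the
    // input string, so it is the same on every run and every machine.
    public static uint FromName(string name)
    {
        using (var sha256 = SHA256.Create())
        {
            byte[] digest = sha256.ComputeHash(Encoding.UTF8.GetBytes(name));

            // Combine the bytes by hand (big-endian) so the value does not
            // depend on the byte order of the platform.
            return ((uint)digest[0] << 24)
                 | ((uint)digest[1] << 16)
                 | ((uint)digest[2] << 8)
                 | digest[3];
        }
    }
}
```

Calling StableId.FromName(mod) in place of mod.GetHashCode() would then give the same ID for the same mod name across runs.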

Upvotes: 5

Efthymios Kalyviotis

Reputation: 969

I wouldn't trust any solution involving hashing. Eventually you will end up with conflicting IDs, especially if you have a huge number of records in your DB.

What I would prefer to do is convert the int ID from the DB to a string when reading it, and then use something like Guid.NewGuid().ToString() to generate a string UID for the local ones.
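To put a rough number on that risk, here is a small sketch using the usual birthday approximation, p ≈ 1 - exp(-n(n-1)/2 / 2^32). It assumes the hash behaves like a uniformly random 32-bit value, which is only an idealization, and the names (Birthday, CollisionProbability) are invented for this example.

```csharp
using System;

public static class Birthday
{
    // Approximate probability that at least two of n uniformly random
    // 32-bit IDs are equal (birthday approximation).
    public static double CollisionProbability(long n)
    {
        double pairs = n * (n - 1) / 2.0;   // number of distinct pairs
        const double idSpace = 4294967296.0; // 2^32 possible uint values
        return 1.0 - Math.Exp(-pairs / idSpace);
    }
}
```

Under that assumption, a thousand names give a chance of roughly one in ten thousand, while around 77,000 names already give about a 50% chance of at least one collision.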

This way you will not have any conflict at all.

I think you will have to adopt some strategy of this kind.
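A minimal sketch of that idea, assuming the app can switch its mod key from uint to string; the names below (ModKeys, FromDatabaseId, NewLocalKey) are hypothetical. Keep in mind that Guid.NewGuid() returns a new value on every call, so the generated key has to be stored with the local mod if it must survive a restart.

```csharp
using System;
using System.Globalization;

public static class ModKeys
{
    // Database mods: the numeric ID rendered as a culture-independent string.
    public static string FromDatabaseId(uint id)
    {
        return id.ToString(CultureInfo.InvariantCulture);
    }

    // Local mods: a freshly generated GUID in the default "D" format
    // (36 characters with hyphens), which cannot equal a purely numeric key.
    public static string NewLocalKey()
    {
        return Guid.NewGuid().ToString();
    }
}
```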

Upvotes: 1

Keelah

Reputation: 302

So I ended up following your advice and found a good answer in another post that uses SHA-1:

// Requires: using System; using System.Security.Cryptography; using System.Text;
// SHA1.Create() replaces SHA1CryptoServiceProvider, which is marked obsolete in .NET 6+.
private readonly SHA1 hash = SHA1.Create();

private uint GetUInt32HashCode(string strText)
{
    if (string.IsNullOrEmpty(strText)) return 0;

    // UTF-16 encoding, covers every character set
    byte[] byteContents   = Encoding.Unicode.GetBytes(strText);
    // SHA-1 produces a 20-byte digest
    byte[] hashText       = hash.ComputeHash(byteContents);
    // Fold three 4-byte slices of the digest (offsets 0, 8 and 16) into one uint
    uint   hashCodeStart  = BitConverter.ToUInt32(hashText, 0);
    uint   hashCodeMedium = BitConverter.ToUInt32(hashText, 8);
    uint   hashCodeEnd    = BitConverter.ToUInt32(hashText, 16);
    var    hashCode       = hashCodeStart ^ hashCodeMedium ^ hashCodeEnd;
    return uint.MaxValue - hashCode;
}

Could probably be optimized but it's good enough for now.
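Because a hashed ID can still coincide with a database ID or with another local mod, it may be worth checking for that when the IDs are assigned. This is only a sketch under assumptions: LocalModIds and Assign are hypothetical names, and the hash function is passed in as a delegate so that any stable string-to-uint hash can be plugged in.

```csharp
using System;
using System.Collections.Generic;

public static class LocalModIds
{
    // Hypothetical helper: gives each local mod the ID hash(name) and
    // records the names whose ID is already taken by a database mod or
    // by an earlier local mod, instead of silently reusing the ID.
    public static Dictionary<string, uint> Assign(
        IEnumerable<string> localMods,
        IEnumerable<uint> databaseIds,
        Func<string, uint> hash,
        List<string> collisions)
    {
        var used = new HashSet<uint>(databaseIds);
        var result = new Dictionary<string, uint>();

        foreach (string mod in localMods)
        {
            if (result.ContainsKey(mod)) continue; // same name listed twice

            uint id = hash(mod);
            if (used.Add(id))
                result[mod] = id;      // ID was free
            else
                collisions.Add(mod);   // ID already in use
        }
        return result;
    }
}
```

How to resolve a reported collision (renaming, asking the user, or falling back to another scheme) is left open here.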

Upvotes: 1
