Reputation: 1625

Unique ID for each class

I want a unique ID (preferably static, without computation) for each class implementation, but not for each instance. The most obvious way to do this is to hardcode a value in the class, but keeping the values unique then becomes a task for a human, which isn't ideal.

abstract class Base
{
    // Each subclass returns its own hardcoded ID
    public abstract int GetID();
}
class Foo : Base
{
    public override int GetID() => 10;
}
class Bar : Base
{
    public override int GetID() => 20;
}

Foo foo1 = new Foo();
Foo foo2 = new Foo();
Bar bar  = new Bar();

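bool sameClass = foo1.GetID() == foo2.GetID();      // true: same class, same ID
bool differentClass = foo1.GetID() != bar.GetID();  // true: different classes, different IDs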

The class name would be an obvious unique identifier, but I need an int (or fixed length bytes). I pack the entire object into bytes, and use the id to know what class it is when I unpack it at the other end.

Hashing the class name every time I call GetID() seems needlessly CPU-heavy just to get an ID number.

I could also make an enum as a lookup, but again I need to populate the enum manually.

EDIT: People have been asking important questions, so I'll put the info here.

Needs to be unique per class, not per instance (this is why the identified duplicate question doesn't answer this one).
ID value needs to be persistent between runs.
Value needs to be fixed length bytes or int. Variable length strings such as class name are not acceptable.
Needs to reduce CPU load wherever possible (caching results or using assembly-based metadata instead of computing a hash each time).
Ideally, the ID can be retrieved from a static function. This means I can make a static lookup function that matches ID to class.
The number of different classes that need an ID isn't that big (<100), so collisions aren't a major concern.

EDIT2:

Some more colour since people are skeptical that this is really needed. I'm open to a different approach.

I'm writing some networking code for a game, and it's broken down into message objects. Each different message type is a class that inherits from MessageBase and adds its own fields, which will be sent.

The MessageBase class has a method for packing itself into bytes, and it sticks a message identifier (the class ID) on the front. When it comes to unpacking it at the other end, I use the identifier to know how to unpack the bytes. This results in messages that are easy to pack and unpack, with very little overhead (a few bytes for the ID, then just the class property values).
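For readers who want to see the shape of this scheme, here is a minimal sketch of an ID-prefixed message. MoveMessage, its fields, the Unpacker class and the ID value 10 are all hypothetical names made up for illustration; they are not taken from my actual code.

```csharp
using System;
using System.IO;

// Hypothetical message type; the fields and the ID value are made up for illustration.
public class MoveMessage
{
    public const uint Id = 10;   // assumed fixed-length class ID
    public float X;
    public float Y;

    // Write the 4-byte ID first, then the fields in a fixed order.
    public byte[] Pack()
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(Id);
            writer.Write(X);
            writer.Write(Y);
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Read the fields back; the caller has already consumed the ID.
    public static MoveMessage Unpack(BinaryReader reader)
    {
        return new MoveMessage { X = reader.ReadSingle(), Y = reader.ReadSingle() };
    }
}

public static class Unpacker
{
    // Read the ID prefix and dispatch to the matching class.
    public static object Unpack(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            uint id = reader.ReadUInt32();
            switch (id)
            {
                case MoveMessage.Id: return MoveMessage.Unpack(reader);
                default: throw new InvalidDataException("Unknown message ID " + id);
            }
        }
    }
}
```

The switch in Unpacker is the part that has to know every ID, which is why a generated or hashed ID plus a lookup table would be nicer than hand-maintained constants.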

Currently I hard code an ID number in the classes, but it doesn't seem like the best way of doing things.

EDIT3: Here is my code after implementing the accepted answer.

public class MessageBase
{
    public MessageID id { get { return GetID(); } }

    private MessageID cacheId;
    private MessageID GetID()
    {
        // Check if cacheId hasn't been initialised
        if (cacheId == null)
        {
            // Hash the class name
            MD5 md5 = MD5.Create();
            byte[] md5Bytes = md5.ComputeHash(Encoding.UTF8.GetBytes(GetType().AssemblyQualifiedName));

            // Convert the first few bytes into a uint32, and create the messageID from it and store in cache
            cacheId = new MessageID(BitConverter.ToUInt32(md5Bytes, 0));
        }

        // Return the cacheId
        return cacheId;
    }
}

public class Protocol
{
    private Dictionary<Type, MessageID> messageTypeToId = new Dictionary<Type, MessageID>();
    private Dictionary<MessageID, Type> idToMessageType = new Dictionary<MessageID, Type>();
    private Dictionary<MessageID, Action<MessageBase>> handlers = new Dictionary<MessageID, Action<MessageBase>>();

    public Protocol()
    {
        // Create a list of all classes in this namespace that are a subclass of MessageBase
        IEnumerable<Type> messageClasses = from t in Assembly.GetExecutingAssembly().GetTypes()
                                           where t.Namespace == GetType().Namespace && t.IsSubclassOf(typeof(MessageBase))
                                           select t;

        // Iterate through the list of message classes, and store their type and id in the dicts
        foreach(Type messageClass in messageClasses)
        {
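            // id is an instance property, so read it from a throwaway instance
            // (assumes every message class has a public parameterless constructor)
            MessageID id = ((MessageBase)Activator.CreateInstance(messageClass)).id;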
            messageTypeToId[messageClass] = id;
            idToMessageType[id] = messageClass;
        }
    }
}

Upvotes: 3

Answers (3)

Adam G

Reputation: 1323

Here is one suggestion. I have used a SHA-256 byte array, which is guaranteed to be a fixed size and astronomically unlikely to have a collision. That may well be overkill, and you can easily swap it for something smaller. You could also use the AssemblyQualifiedName rather than FullName if you need to worry about version differences or the same class name in multiple assemblies.

Firstly, here are all my usings

using System;
using System.Collections.Concurrent;
using System.Text;
using System.Security.Cryptography;

Next, a static cached type hasher object to remember the mapping between your types and the resulting byte arrays. You don't need the Console.WriteLines below, they are just there to demonstrate that you are not computing it over and over again.

public static class TypeHasher
{
    private static ConcurrentDictionary<Type, byte[]> cache = new ConcurrentDictionary<Type, byte[]>();
    public static byte[] GetHash(Type type)
    {
        byte[] result;
        if (!cache.TryGetValue(type, out result))
        {
            Console.WriteLine("Computing Hash for {0}", type.FullName);
            // SHA256Managed is marked obsolete in newer .NET versions; SHA256.Create() works everywhere
            using (SHA256 sha = SHA256.Create())
            {
                result = sha.ComputeHash(Encoding.UTF8.GetBytes(type.FullName));
            }
            cache.TryAdd(type, result);
        }
        else
        {
            // Not actually required, but shows that hashing only done once per type
            Console.WriteLine("Using cached Hash for {0}", type.FullName);
        }

        return result;
    }
}

Next, an extension method on object so that you can ask for anything's id. Of course if you have a more suitable base class, it doesn't need to go on object per se.

public static class IdExtension
{
    public static byte[] GetId(this object obj)
    {
        return TypeHasher.GetHash(obj.GetType());
    }
}

Next, here are some random classes

public class A
{
}

public class ChildOfA : A
{
}

public class B
{
}

And finally, here is everything put together.

public class Program
{
    public static void Main()
    {
        A a1 = new A();
        A a2 = new A();
        B b1 = new B();
        ChildOfA coa = new ChildOfA();
        Console.WriteLine("a1 hash={0}", Convert.ToBase64String(a1.GetId()));
        Console.WriteLine("b1 hash={0}", Convert.ToBase64String(b1.GetId()));
        Console.WriteLine("a2 hash={0}", Convert.ToBase64String(a2.GetId()));
        Console.WriteLine("coa hash={0}", Convert.ToBase64String(coa.GetId()));
    }
}

Here is the console output

Computing Hash for A
a1 hash=VZrq0IJk1XldOQlxjN0Fq9SVcuhP5VWQ7vMaiKCP3/0=
Computing Hash for B
b1 hash=335w5QIVRPSDS77mSp43if68S+gUcN9inK1t2wMyClw=
Using cached Hash for A
a2 hash=VZrq0IJk1XldOQlxjN0Fq9SVcuhP5VWQ7vMaiKCP3/0=
Computing Hash for ChildOfA
coa hash=wSEbCG22Dyp/o/j1/9mIbUZTbZ82dcRkav4olILyZs4=

On the other side, you would use reflection to iterate all of the types in your library and store a reverse dictionary of hash to type.
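As a rough sketch of that reverse lookup: the ReverseTypeLookup class and its method names below are hypothetical, and it repeats the SHA-256 of FullName hashing inline so that it stands alone. One detail worth knowing is that byte[] compares by reference when used as a dictionary key, so this version keys on the Base64 text of the hash instead.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Security.Cryptography;
using System.Text;

public static class ReverseTypeLookup
{
    // byte[] keys compare by reference, so key the lookup on the Base64 text of the hash.
    private static readonly Dictionary<string, Type> hashToType = new Dictionary<string, Type>();

    // Same scheme as TypeHasher above: SHA-256 of Type.FullName.
    public static byte[] Hash(Type type)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(Encoding.UTF8.GetBytes(type.FullName));
        }
    }

    // Register every type in the given assembly that has a FullName.
    public static void Register(Assembly assembly)
    {
        foreach (Type type in assembly.GetTypes())
        {
            if (type.FullName == null) continue;
            hashToType[Convert.ToBase64String(Hash(type))] = type;
        }
    }

    // Returns null when the hash does not belong to a registered type.
    public static Type Find(byte[] hash)
    {
        Type type;
        return hashToType.TryGetValue(Convert.ToBase64String(hash), out type) ? type : null;
    }
}
```

You would call Register once at startup for the assembly that holds your types, then call Find with the hash bytes read from the incoming data.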

Upvotes: 3

Jon Skeet

Reputation: 1500035

Given that you can get a Type by calling GetType on the instance, you can easily cache the results. That reduces the problem to working out how to generate an ID for each type. You'd then call something like:

int id = typeIdentifierCache.GetIdentifier(foo1.GetType());

... or make GetIdentifier accept object and it can call GetType(), leaving you with

int id = typeIdentifierCache.GetIdentifier(foo1);

At that point, the detail is all in the type identifier cache.

A simple option would be to take a hash (e.g. SHA-256, for stability and making it very unlikely that you'll encounter collisions) of the fully-qualified type name. To prove that you have no collisions, you could easily write a unit test that runs over all the type names in the assembly and hashes them, then checks there are no duplicates. (Even that might be overkill, given the nature of SHA-256.)

This is all assuming that the types are in a single assembly. If you need to cope with multiple assemblies, you may want to hash the assembly-qualified name instead.
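A minimal sketch of what such a cache might look like is below. The TypeIdentifierCache class, the choice to keep only the first four bytes of the hash, and the HasCollisions helper are all assumptions made for illustration. Truncating SHA-256 to an int makes collisions far more likely than with the full hash, which is exactly why the duplicate check is worth running in a unit test.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public class TypeIdentifierCache
{
    private readonly ConcurrentDictionary<Type, int> cache = new ConcurrentDictionary<Type, int>();

    // Compute the ID once per type, then serve it from the cache.
    public int GetIdentifier(Type type)
    {
        return cache.GetOrAdd(type, ComputeIdentifier);
    }

    // Convenience overload that takes an instance.
    public int GetIdentifier(object obj)
    {
        return GetIdentifier(obj.GetType());
    }

    private static int ComputeIdentifier(Type type)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(type.FullName));
            // Assemble the bytes by hand so the value does not depend on machine endianness.
            return (hash[0] << 24) | (hash[1] << 16) | (hash[2] << 8) | hash[3];
        }
    }

    // True if two of the given types share an ID; expects each type to appear only once.
    public static bool HasCollisions(IEnumerable<Type> types)
    {
        var cache = new TypeIdentifierCache();
        var seen = new HashSet<int>();
        foreach (Type type in types)
        {
            if (!seen.Add(cache.GetIdentifier(type))) return true;
        }
        return false;
    }
}
```

A unit test could pass the message types from Assembly.GetTypes() to HasCollisions and fail the build if it returns true.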

Upvotes: 2

ForeverZer0

Reputation: 2496

I have not seen you answer whether the same value needs to persist between different runs, but if all you need is a unique ID for a class, then use the simple built-in GetHashCode method:

class BaseClass
{
    // typeof(this) is not valid C#; GetType() returns the runtime type of the instance
    public int ClassId() => GetType().GetHashCode();
}

If you are worried about the performance of multiple calls to GetHashCode(), then first, don't: that is a ridiculous micro-optimization. But if you insist, then store the result.

GetHashCode() is fast. That is its entire purpose: a fast way to compare values in a hash table.

EDIT: After doing some tests, the same hash code is returned between different runs using this method. I did not test after altering the classes, though, and I am not aware of exactly how a Type is hashed.

Upvotes: 0
