Generating an identifier for objects so that they can be added to a hashtable I have created

Question

I have a hash table base class, and I am creating different types of hash table by deriving from it. It only accepts objects that implement my IHashable interface. For example:

class LinearProbingHashTable<T> : HashTableBase<T> where T : IHashable
{
...
...
...
}

interface IHashable
{
    /**
     * Every IHashable implementation should provide an identifying value for use in generating a hash key.
     */
    int getIdentifier();
}

class Car : IHashable
{
    public String Make { get; set; }
    public String Model { get; set; }
    public String Color { get; set; }
    public int Year { get; set; }

    public int getIdentifier()
    {
        /// ???
    }
}

Can anyone suggest a good method for generating an identifier for the car that can be used by the hash function to place it in the hash table?

What I am really looking for is a general-purpose way to generate an id for any given class. I would like a base class for all classes, HashableObject, that implements IHashable and its getIdentifier method. I could then derive from HashableObject, which would automatically provide an identifier for every instance, so I wouldn't have to write a different getIdentifier method for every type I add to the hash table.

public class HashableObject : IHashable
{
  public int getIdentifier()
  {
    // Looking for code here that would generate an id for any object...
  }
}

public class Dog : HashableObject
{
  // Don't need to implement getIdentifier because the parent class does it for me
}

MiMo · Accepted Answer

I would split the problem in two:

1. How to generate hash codes of primitive types: strings, integers, etc.
2. How to combine multiple hash codes into one hash code

Using (1) and then (2), you can generate the hash code of any class or structure.

The naive way to do (1) for strings is to add up the character codes of all the characters in the string:

public static int getStringIdentifier(string str)
{
   int result = 0;
   foreach (char c in str) {
     result += (int)c;
   }
   return result;
}

Similar naive algorithms can be used for other basic data types (which are all arrays of bytes in the end).

The naive way to do (2) is to simply combine the various hash codes with XOR:

public int getIdentifier() 
{ 
  return getStringIdentifier(Make) ^ getStringIdentifier(Model) ^ getStringIdentifier(Color);     
}

These algorithms will work, but they distribute hash code values poorly, so collisions will be frequent.
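
To make the weakness concrete, here is a small sketch that reuses the additive string hash from above. The class name NaiveHashDemo and the helper XorCombine are made up for this illustration. Addition ignores character order, so anagrams collide, and XOR ignores argument order and cancels equal values.

```csharp
using System;

public static class NaiveHashDemo
{
    // Same additive string hash as in the answer above.
    public static int GetStringIdentifier(string str)
    {
        int result = 0;
        foreach (char c in str)
        {
            result += (int)c;
        }
        return result;
    }

    // Hypothetical helper: the XOR combination step, for two values.
    public static int XorCombine(int a, int b)
    {
        return a ^ b;
    }
}
```

With these definitions, GetStringIdentifier("listen") equals GetStringIdentifier("silent"), XorCombine(a, b) equals XorCombine(b, a), and XorCombine(a, a) is always 0. A car whose Make and Model happen to hash to the same value would therefore contribute nothing from those two fields.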

If you want better algorithms you can have a look at how the .NET framework does it - here is the source code of the class used internally to combine multiple hash codes, and here is the source code of the String class - including String.GetHashCode().

As you can see, they are variants of the naive ones above, with different starting values and more complex combination steps.
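
As a rough illustration of what "different starting values and more complex combinations" can look like, here is a minimal sketch using the common multiply-and-add pattern. This is not the .NET framework's actual code. The class name HashHelpers, the method name Combine, and the constants 17 and 31 are assumptions chosen for the example.

```csharp
using System;

public static class HashHelpers
{
    // Order-sensitive string hash: each character is mixed in by
    // multiplying the running value by a small prime before adding.
    public static int GetStringIdentifier(string str)
    {
        if (str == null) return 0;
        unchecked // let integer overflow wrap around instead of throwing
        {
            int result = 17; // non-zero starting value (assumed constant)
            foreach (char c in str)
            {
                result = result * 31 + c;
            }
            return result;
        }
    }

    // Combines several hash codes; unlike XOR, the result depends on
    // the order of the inputs, and equal inputs do not cancel out.
    public static int Combine(params int[] hashes)
    {
        unchecked
        {
            int result = 17;
            foreach (int h in hashes)
            {
                result = result * 31 + h;
            }
            return result;
        }
    }
}
```

Car.getIdentifier() could then return HashHelpers.Combine(HashHelpers.GetStringIdentifier(Make), HashHelpers.GetStringIdentifier(Model), HashHelpers.GetStringIdentifier(Color), Year). On recent .NET versions, System.HashCode.Combine offers a built-in alternative to a hand-written combiner.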

If you want a single method that works on different classes, the way to do it is to use reflection to detect all the primitive fields contained in the class, compute their hash codes using the primitive functions, and then combine them. It is tricky and extremely .NET-specific though - my preference would be to create methods handling the primitive types and then just redefine getIdentifier() for each class.
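
For completeness, a minimal sketch of the reflection approach might look like the following. It makes several assumptions: it only looks at public instance properties, it falls back on each value's built-in GetHashCode() instead of hand-written primitive hash functions, and the Dog properties Name and Age are invented for the example.

```csharp
using System;
using System.Reflection;

public interface IHashable
{
    int getIdentifier();
}

public class HashableObject : IHashable
{
    public int getIdentifier()
    {
        // Reflect over the runtime type, so derived classes are covered.
        PropertyInfo[] props = GetType().GetProperties(
            BindingFlags.Public | BindingFlags.Instance);

        // The order returned by GetProperties is not guaranteed,
        // so sort by name to keep the result deterministic.
        Array.Sort(props, (a, b) => string.CompareOrdinal(a.Name, b.Name));

        unchecked // let integer overflow wrap around
        {
            int result = 17;
            foreach (PropertyInfo prop in props)
            {
                // Skip write-only properties and indexers.
                if (!prop.CanRead || prop.GetIndexParameters().Length > 0)
                {
                    continue;
                }
                object value = prop.GetValue(this, null);
                // Assumption: use the value's own GetHashCode() here.
                int h = value == null ? 0 : value.GetHashCode();
                result = result * 31 + h;
            }
            return result;
        }
    }
}

public class Dog : HashableObject
{
    // Hypothetical properties, for illustration only.
    public string Name { get; set; }
    public int Age { get; set; }
}
```

Two Dog instances with equal property values get the same identifier within one process. Note that string.GetHashCode() can return different values in different runs on newer .NET versions, so identifiers produced this way should not be persisted, and the reflection calls make this noticeably slower than a hand-written getIdentifier().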
