Reputation: 35903

Hash function for indexed objects

Say I have a class that assigns every object created from it an index 0, ..., n-1 (using a static counter of created objects). Since these objects are used in HashSets and Dictionaries, they need a hash function.

Is there any reason not to use this index as the hash value?

Upvotes: 2

Answers (3)

Scott Chamberlain

Reputation: 127603

Here is the actual code for Contains on a HashSet:

private int[] m_buckets;
private Slot[] m_slots;

public bool Contains(T item) {
    if (m_buckets != null) {
        int hashCode = InternalGetHashCode(item);
        // see note at "HashSet" level describing why "- 1" appears in for loop
        for (int i = m_buckets[hashCode % m_buckets.Length] - 1; i >= 0; i = m_slots[i].next) {
            if (m_slots[i].hashCode == hashCode && m_comparer.Equals(m_slots[i].value, item)) {
                return true;
            }
        }
    }
    // either m_buckets is null or wasn't found
    return false;
}

private int InternalGetHashCode(T item) {
    if (item == null) {
        return 0;
    } 
    return m_comparer.GetHashCode(item) & Lower31BitMask;
}

internal struct Slot {
    internal int hashCode;      // Lower 31 bits of hash code, -1 if unused
    internal T value;
    internal int next;          // Index of next entry, -1 if last
}

The key thing to notice is that it calls GetHashCode() and then computes hashCode % m_buckets.Length to decide which singly linked list to traverse. m_buckets holds the index of the head of each list, and the entries themselves live in m_slots.

The best possible algorithm gives an even distribution of values across hashCode % m_buckets.Length, so that all linked lists are (as near as possible) the same length. Starting at 0 and counting up does this perfectly, so yes: if every object gets a fixed, unique index that simply counts up, that index is a perfect hash code.
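To make the "even distribution" point concrete, here is a minimal sketch that applies the same hashCode % length rule to the hash codes 0..n-1 and counts how many land in each bucket. BucketDemo, CountPerBucket and the bucket count of 7 are made up for illustration; this is not HashSet's own code.

```csharp
using System;
using System.Linq;

public static class BucketDemo
{
    // Counts how many of the hash codes 0..n-1 fall into each bucket when
    // the bucket is chosen as hashCode % bucketCount.
    public static int[] CountPerBucket(int n, int bucketCount)
    {
        var counts = new int[bucketCount];
        for (int hashCode = 0; hashCode < n; hashCode++)
        {
            counts[hashCode % bucketCount]++;
        }
        return counts;
    }

    public static void Main()
    {
        // 7 is an arbitrary bucket count picked for the example.
        int[] counts = CountPerBucket(100, 7);
        Console.WriteLine($"min={counts.Min()}, max={counts.Max()}"); // min=14, max=15
    }
}
```

With sequential indices, the bucket sizes never differ by more than one, which is as even as the chains can get.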

Upvotes: 1

David

Reputation: 10708

One reason not to use the index as a hash code is that you may want separately created instances to count as duplicates.

Say you're using a Dictionary in an Entity system, and your keys are a combination of the Entity and the Component type for any given Component. When looking up a Component, you want to be able to create a new key from the Entity and Component type and have it equal the stored key with the same Entity and Component type. Here a statically incrementing index is not the way to go: two objects representing the same value would get different hash codes, which makes them useless as Dictionary keys.

Another reason is that you might have an arbitrarily huge number of objects of a type in a program with an extended lifetime, say the transaction manager in a database driver. In such a case, you might actually run out of integer values (about 4.29 billion if you allow negatives or use a uint). At that point the hash code is no longer enough to guarantee uniqueness. This is normal behaviour for hash codes, but a very possible gotcha for an overzealous optimization.
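A rough sketch of such a value-based key, assuming an entity is identified by an int id. ComponentKey, EntityId and ComponentType are hypothetical names, not taken from any particular entity system.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: equality and hash code depend only on the values,
// never on which instance was created first.
public struct ComponentKey : IEquatable<ComponentKey>
{
    public int EntityId { get; }
    public Type ComponentType { get; }

    public ComponentKey(int entityId, Type componentType)
    {
        EntityId = entityId;
        ComponentType = componentType;
    }

    public bool Equals(ComponentKey other) =>
        EntityId == other.EntityId && ComponentType == other.ComponentType;

    public override bool Equals(object obj) =>
        obj is ComponentKey other && Equals(other);

    // Simple hand-rolled combination of the two fields; 397 is just a common
    // choice of odd multiplier, nothing special.
    public override int GetHashCode()
    {
        unchecked
        {
            int typeHash = ComponentType == null ? 0 : ComponentType.GetHashCode();
            return (EntityId * 397) ^ typeHash;
        }
    }
}

public static class ComponentKeyDemo
{
    public static void Main()
    {
        var components = new Dictionary<ComponentKey, string>();
        components[new ComponentKey(1, typeof(string))] = "name component";

        // A freshly built key finds the entry stored under an equal key.
        bool found = components.ContainsKey(new ComponentKey(1, typeof(string)));
        Console.WriteLine(found); // True
    }
}
```

If the hash code came from a creation counter instead, the second key would hash differently and the lookup would miss.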
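A small sketch of what happens once an index is reused. The wrap-around is simulated by passing the same index twice; Indexed is a hypothetical type, not the asker's class. The collection still works because equality, not the hash code, decides whether two objects are the same; the collision only costs a little lookup time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type whose hash code is an explicitly supplied index, so a
// wrapped-around counter can be simulated by reusing an index.
public sealed class Indexed
{
    private readonly int index;

    public Indexed(int index) { this.index = index; }

    // Equals is not overridden, so equality stays reference-based.
    public override int GetHashCode() => index;
}

public static class CollisionDemo
{
    public static void Main()
    {
        var first = new Indexed(0);
        var afterWrap = new Indexed(0); // same hash code, different object

        var set = new HashSet<Indexed> { first };

        Console.WriteLine(set.Add(afterWrap)); // True: the hashes collide, but the objects are not equal
        Console.WriteLine(set.Count);          // 2
    }
}
```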

Upvotes: 0

Servy

Reputation: 203847

You certainly could use it, but if you did, it would mean that each separate object instance is considered a different object by those hash-based structures. If you want different object instances to be able to be considered "equal", then this method won't work.

If treating every instance as distinct is in fact your goal, there's no reason to override the default equality/hash-code semantics at all. The default implementation compares object references, so each object is "different" from every other object. So save yourself the effort and just don't bother doing anything.
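For illustration, a minimal sketch of the default behaviour, using a hypothetical Node class that overrides neither Equals nor GetHashCode:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class relying entirely on the default (reference) equality.
public sealed class Node
{
    public string Label { get; }

    public Node(string label) { Label = label; }
}

public static class DefaultEqualityDemo
{
    public static void Main()
    {
        var a = new Node("x");
        var b = new Node("x"); // same contents, different instance

        var set = new HashSet<Node> { a };

        Console.WriteLine(set.Contains(a)); // True: same reference
        Console.WriteLine(set.Contains(b)); // False: a distinct instance
    }
}
```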

Upvotes: 3
