Reputation: 1618
I'm developing a simple 2D environment in which each drawn object (e.g. line, rectangle and ...) gets a unique ID by calling GetHashCode().
However, I noticed that the MSDN page says the result is not guaranteed to be unique:
The default implementation of the GetHashCode method does not guarantee unique return values for different objects. Furthermore, the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value it returns will be the same between different versions of the .NET Framework. Consequently, the default implementation of this method must not be used as a unique object identifier for hashing purposes.
So my question is: what other options exist besides the GetHashCode() method?
Thanks, Amit
Upvotes: 0
Views: 810
Reputation: 45096
You will need to generate your own unique ID.
Sometimes you can derive a unique ID from the object's properties, if the object has a natural key.
If the object does not have a natural key, you must generate a unique ID, and you would typically pass it to the object in the constructor.
GetHashCode is a poor unique ID because it is not guaranteed to be unique.
Internally .NET does not use GetHashCode for uniqueness.
Internally .NET uses GetHashCode to speed up equality comparison and to assign hash buckets.
If you are going to generate your own unique ID, then you should override GetHashCode and Equals.
That way .NET can use your unique identifier for equality comparison.
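As a minimal sketch of that advice, assuming an Int32 ID is enough: the class name Shape and the Id property are made up for illustration and are not from the question.

```csharp
using System;

// Hypothetical base class for drawn objects; Shape and Id are illustrative names.
public class Shape
{
    public int Id { get; }

    // The caller supplies the unique ID, e.g. from a counter or a natural key.
    public Shape(int id)
    {
        Id = id;
    }

    public override bool Equals(object obj)
    {
        // Equal only if the run-time types match and the IDs match.
        if (obj == null || GetType() != obj.GetType())
            return false;
        return Id == ((Shape)obj).Id;
    }

    public override int GetHashCode()
    {
        // The ID is already an Int32, so it can serve directly as the hash code.
        return Id;
    }
}
```

With this in place, two Shape instances constructed with the same ID compare as equal and land in the same hash bucket; keeping the IDs unique is still the caller's job.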
GetHashCode() is neither required nor guaranteed to be unique.
GetHashCode() returns an Int32, so there are at most 2^32 distinct hash codes.
If two objects have different hash codes, they are not equal.
If two objects have the same hash code, they may or may not be equal.
Equals is the tie breaker.
For speed, hashed collections compare hash codes first and call Equals only when the hash codes match.
GetHashCode is also used to pick hash buckets in collections like HashSet and Dictionary.
A hash that is unique for every value is called a perfect hash.
Classic example:

    class Point: object
    {
        protected int x, y;

        public Point(int xValue, int yValue)
        {
            x = xValue;
            y = yValue;
        }

        public override bool Equals(Object obj)
        {
            // Check for null values and compare run-time types.
            if (obj == null || GetType() != obj.GetType())
                return false;
            Point p = (Point)obj;
            return (x == p.x) && (y == p.y);
        }

        public override int GetHashCode()
        {
            return x ^ y;
        }
    }

Since Point has 2^32 * 2^32 = 2^64 possible values, it obviously cannot be uniquely identified by a single Int32. GetHashCode is still valuable and required. With a well-distributed hash there is only about a 1 in 2^32 chance that two different points share a hash code and the more expensive Equals is needed, and the hash code is also used to pick hash buckets.
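To make the collision concrete, here is a small sketch. XorPoint is a renamed copy of the Point class above so that the example compiles on its own, and CollisionDemo is a made-up helper. The points (1, 2) and (2, 1) both hash to 3 under x ^ y, yet Equals tells them apart.

```csharp
using System;
using System.Collections.Generic;

// Copy of the Point class above under a different (hypothetical) name.
class XorPoint
{
    private readonly int x, y;

    public XorPoint(int xValue, int yValue) { x = xValue; y = yValue; }

    public override bool Equals(object obj)
    {
        if (obj == null || GetType() != obj.GetType())
            return false;
        XorPoint p = (XorPoint)obj;
        return x == p.x && y == p.y;
    }

    public override int GetHashCode() { return x ^ y; }
}

static class CollisionDemo
{
    // Counts how many distinct points a HashSet keeps for (1,2) and (2,1).
    public static int CountDistinct()
    {
        var a = new XorPoint(1, 2);
        var b = new XorPoint(2, 1);
        // Both hash codes are 3, so the set falls back to Equals,
        // which reports the two points as different.
        var set = new HashSet<XorPoint> { a, b };
        return set.Count;
    }
}
```

CountDistinct() returns 2: the shared hash code only puts both points in the same bucket, and Equals acts as the tie breaker.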
Consider a simpler point:

    class Point: object
    {
        protected byte x, y;

        public Point(byte xValue, byte yValue)
        {
            x = xValue;
            y = yValue;
        }

        public override bool Equals(Object obj)
        {
            // Check for null values and compare run-time types.
            if (obj == null || GetType() != obj.GetType())
                return false;
            Point p = (Point)obj;
            return (x == p.x) && (y == p.y);
        }

        public override int GetHashCode()
        {
            // Two bytes fit in an Int32, so every (x, y) pair gets its own value.
            return (x * 256) + y;
        }
    }

For this simple point, GetHashCode uniquely identifies the value, because the two bytes fit into a single Int32. You should not override only one of Equals and GetHashCode: override both or neither.
Upvotes: 3
Reputation: 4609
It depends on what you are using the unique ID for. It sounds like you are using it to identify object instances, which might mean that hash codes are not what you want.
If two objects are equal according to .Equals(), they are supposed to have the same hash code, but as you discovered, the reverse is not true (having the same hash code doesn't mean they are .Equals()).
What do you need the unique IDs for? If you aren't using hash codes to put the objects in a lookup, you might be better off assigning them a unique ID like a Guid (var uniqueId = Guid.NewGuid()).
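A minimal sketch of the Guid approach, assuming each object simply takes a fresh Guid when it is created (DrawnObject is a hypothetical class name):

```csharp
using System;

// Hypothetical drawable object that receives a Guid at construction time.
public class DrawnObject
{
    // Guid.NewGuid() yields a 128-bit value; a collision is extremely
    // unlikely in practice, though not strictly impossible.
    public Guid Id { get; } = Guid.NewGuid();
}
```

Each instance then carries its own Id, which can be used as a Dictionary key or written to a save file without depending on GetHashCode.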
Upvotes: 2
Reputation: 2023
Perhaps it would be best to move away from hash codes altogether? GetHashCode is nice for a quick and easy fix, but if you need real IDs for objects, then you should create real IDs. Something like a 32- or 64-bit auto-incrementing integer would likely be plenty.
While the collision rate of a hash code is tied to the length of the hash, you are not guaranteed to reach the maximum number of unique hashes before getting a collision. If you manage the IDs yourself, you can plan ahead to have enough IDs available.
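A rough sketch of such an auto-incrementing ID, assuming a single process-wide 64-bit counter (IdGenerator and Next are made-up names):

```csharp
using System.Threading;

// Hypothetical process-wide ID source; IDs start at 1 and never repeat
// until the 64-bit counter overflows.
public static class IdGenerator
{
    private static long lastId = 0;

    // Interlocked.Increment makes the increment atomic, so concurrent
    // callers never receive the same value.
    public static long Next()
    {
        return Interlocked.Increment(ref lastId);
    }
}
```

Each new object would call IdGenerator.Next() once, typically in its constructor, and keep the result. If the IDs are saved to a file, the counter would also need to be restored on load, which this sketch does not cover.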
Also, regarding your comment about GetHashCode() differing between versions of the framework: I can only imagine this would matter in your situation if you were persisting the hashes to some sort of save file and then trying to reload them, only to find that they don't match the hashes in the running program because they were saved by a different version of the framework. If this is the case, I would suggest even more strongly that you create and manage IDs on the objects yourself.
Upvotes: 3
Reputation: 20366
No general-purpose hash function guarantees that the values it returns are unique.
What matters is how small the probability of a collision is.
GetHashCode() returns a 32-bit integer, which may not be enough to assume uniqueness. Consider other algorithms such as SHA-1 or SHA-2, which produce longer hashes and therefore have a much lower probability of collision than a 32-bit integer.
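As an illustration only, a SHA-256 (SHA-2 family) digest can be computed with the standard System.Security.Cryptography classes. ContentHash and Sha256Hex are hypothetical names, and the sketch assumes the object can be described by a string. Note that this hashes content, so two objects with identical content get the same digest.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ContentHash
{
    // Returns the SHA-256 digest of the UTF-8 bytes of a string
    // as 64 uppercase hexadecimal characters.
    public static string Sha256Hex(string text)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(text));
            // BitConverter.ToString yields "BA-78-..."; drop the dashes.
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }
}
```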
Upvotes: 1