Reputation: 803
Hash in Ruby just uses its hash value (for strings and numbers). Internally, it uses the Murmur hash function. I wonder how it can can be done given that the probability of having the same hash value for two different keys is not zero.
Upvotes: 4
Views: 1346
Reputation: 4970
Can you share with us how you came to the conclusion that Ruby uses only the hash value to determine equality?
The text below is to explain to others your excellent point that the probability of computing the same hash value for two different keys is not zero, so how can the Hash class rely on just the hash value to determine equality?
For the purpose of this discussion I will refer to Ruby hashes as maps, so as not to confuse the 2 uses of the term hash in the Ruby language (1, a computed value on an object, and 2, a map/dictionary of pairs of values and unique keys).
As I understand it, hash values in maps, sets, etc. are used as a quick first pass at determining possible equality. That is, if the hashes of 2 objects are equal, then it is possible that the 2 objects are equal; but it's also possible that the 2 objects are not equal, but coincidentally produce the same hash value.
In other words, the only sure thing you can tell about equality from the hash values of the objects being compared is that if hash1 != hash2 then the objects are definitely not equal.
If the 2 hashes are equal, then the 2 objects must be compared by their content (in Ruby, by calling the ==
method, I believe).
So comparing hashes is not a substitute for comparing the objects themselves, it's just a quick first pass used to optimize performance.
Upvotes: 3
Reputation: 211720
Remember that a "hash table" or dictionary is perfectly okay with collisions. In fact, it's expected and accommodated in any reasonable implementation.
Ideally you strive to have a hash with as few collisions as possible, and there are entire doctoral level discussions on what makes a good hashing function, but they are inevitable. When a collision does occur then two values share the same index in the container.
Regardless of how a value is hashed, any potential match based on hash must be evaluated. A direct comparison is performed to ensure that the value you're accessing is the one requested, not one that coincidentally maps to the same spot.
Normal hash tables can be thought of as an array of arrays even though this is all completely hidden from you in general purpose use.
You can implement your own hash table in Ruby if you want to explore how this behaves:
class ExampleHash
include Enumerable
def initialize
@size = 9
@slots = Array.new(@size) { [ ] }
end
def [](key)
@slots[key.hash % @size].each do |entry|
if (entry[0] == key)
return entry[1]
end
end
nil
end
def []=(key, value)
entries = @slots[key.hash % @size]
entries.each do |entry|
if (entry[0] == key)
entry[1] = value
return
end
end
entries << [ key, value ]
end
end
This is made easy since every object in Ruby has a built-in hash
method that produces a large numerical value that's based on the object's content.
Upvotes: 2