Reputation: 166
If I use the following code:
class Item
  attr_reader :item_name, :qty

  def initialize(item_name, qty)
    @item_name = item_name
    @qty = qty
  end

  def to_s
    "Item (#{@item_name}, #{@qty})"
  end

  def hash
    p "hash has been called"
    self.item_name.hash ^ self.qty.hash
  end

  def eql?(other_item)
    puts "#eql? invoked"
    @item_name == other_item.item_name && @qty == other_item.qty
  end
end

p Item.new("abcd", 1).hash
items = [Item.new("abcd", 1), Item.new("abcd", 1), Item.new("abcd", 1)]
p items.uniq
The output is:
"hash has been called"
4379041107527942435
"hash has been called"
"hash has been called"
#eql? invoked
"hash has been called"
#eql? invoked
"hash has been called"
"hash has been called"
"hash has been called"
[Item (abcd, 1)]
I'm interpreting this to mean that the #hash method is used to generate a unique integer for each object, and that #eql? is then invoked to check whether those integers are equal, as a way of detecting duplicates. Is my interpretation correct?
Upvotes: 1
Views: 100
Reputation: 369468
No, your interpretation is not correct.
#hash does not generate unique integers, which is precisely why the #eql? call is necessary, and #eql? is not called on the integers but on the elements.
This is just plain old hashing, exactly identical to what is used in Hash, Set, and SortedSet.
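To illustrate that the same #hash / #eql? protocol drives Hash, Set, and Array#uniq alike, here is a minimal sketch. The Point class is a hypothetical example, not part of the question's code, and building the hash code from [x, y].hash is just one reasonable choice.

```ruby
require "set"

# Hypothetical value class that follows the hash/eql? protocol.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Equal points must return equal hash codes.
  def hash
    [x, y].hash
  end

  # Invoked on the elements themselves once their hash codes match.
  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
end

a = Point.new(1, 2)
b = Point.new(1, 2)

counts = Hash.new(0)
counts[a] += 1
counts[b] += 1          # b is treated as the same key as a

set = Set.new([a, b])   # the second point is treated as a duplicate

p counts.size           # => 1
p set.size              # => 1
p [a, b].uniq.size      # => 1
```

All three containers treat a and b as the same element, because their hash codes match and #eql? confirms that they are equal.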
#hash is a hash function, i.e. a function which maps a large (potentially infinite) input space to a smaller, fixed-size output space. Since the output space is smaller than the input space, there must necessarily be at least two distinct objects with the same hash code, thus the hash values are not unique! (This is called the Pigeonhole Principle. Intuitively: if you have two drawers and three socks, then there must be at least one drawer with at least two socks in it.)
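The Pigeonhole Principle can be seen with a deliberately tiny hash function. The tiny_hash method below is a toy invented for this illustration (it is not how Ruby hashes strings): it has only four possible outputs, so five distinct inputs must produce at least one collision.

```ruby
# Toy hash function, assumed purely for illustration:
# sum of the string's bytes modulo 4, so only 4 possible outputs.
def tiny_hash(str)
  str.bytes.sum % 4
end

words = %w[a b c d e]   # five distinct inputs, four possible outputs

# Group the inputs by hash code and keep the groups with more than one member.
buckets = words.group_by { |w| tiny_hash(w) }
collisions = buckets.values.select { |group| group.size > 1 }

p collisions            # => [["a", "e"]]
```

Here "a" and "e" are different strings that share a hash code, which is exactly the situation a real hash table has to cope with.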
Because the hash values are not unique, two identical hash values on their own don't tell you anything. If two hash values are different, then you know definitely that the two objects are also different. But if two hash values are the same, then the objects could still be different (this is called a Hash Collision). That's why you have to double-check using #eql?.
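A forced collision makes the role of #eql? visible. In this sketch, the hypothetical Colliding class returns the same hash code for every instance, so the hash codes alone cannot tell distinct objects apart; Array#uniq still gives the right answer because it falls back on #eql?.

```ruby
# Hypothetical class for illustration: every instance has the same hash
# code, so any two instances collide.
class Colliding
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Deliberately poor hash function: constant for all objects.
  def hash
    42
  end

  # The double-check that decides whether colliding objects are really equal.
  def eql?(other)
    other.is_a?(Colliding) && name == other.name
  end
end

list = [Colliding.new("a"), Colliding.new("b"), Colliding.new("a")]
unique_names = list.uniq.map(&:name)

p unique_names          # => ["a", "b"]
```

Even though all three objects share the hash code 42, "a" and "b" are kept as distinct elements and only the second "a" is removed. A constant hash is still a bad idea in practice, since every lookup then degrades into a chain of #eql? comparisons.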
Upvotes: 4