Reputation: 61
My question is: does the default Object implementation of hashCode() use identityHashCode()? I think it does. Please correct me if it does not.
Assuming it does, here is my problem: suppose an object whose hashCode() has been called once is relocated during heap compaction (and carries its identity hash code with it), and a new, different object is then created at the first object's old location. In that case both objects will have the same identity hash code even though they are different objects. How can this be explained?
Upvotes: 0
Views: 218
Reputation: 81277
From what I understand, some Java implementations implement identityHashCode() by reserving two bits in the object header to divide objects into three categories:
1. Those for which identityHashCode() has never been called.
2. Those for which the first-ever call to identityHashCode() occurred after the most recent time the object moved.
3. Those for which the first-ever call to identityHashCode() occurred before the most recent time the object moved.
For an object in the first category, a call to identityHashCode() returns a value derived from the object's address and sets a flag that moves the object into the second category. For an object in the second category, identityHashCode() computes the same value from the address as the first call did. Whenever the GC is about to move an object, it checks whether the object is in the second category. If so, the GC reserves an extra four bytes at the destination to hold the identity hash value the object had before it moved, and the object enters the third category, for which identityHashCode() returns that stored value.
Once an object has had its hash code taken, its effective size increases by four bytes the next time the GC has to move it. Most objects never have their identity hash code taken, however, and some that do are collected soon afterward. Having an object's first call to identityHashCode() allocate an extra four bytes for the object would be difficult, but allocating the extra four bytes when the object is moved is not a problem. Using the object's address as its hash value between the first call to identityHashCode() and the next time the object moves avoids the need to allocate that storage while the space following the object is in use.
Note that while an identity hash method could "legally" use the bit pattern of the address directly, doing so could cause an excessive number of hash duplicates (e.g. the first object created following one GC cycle could easily have the same address as the first object created after another). A simple way to avoid that problem is to have the system add, xor, or otherwise combine the address with a value that changes after every GC cycle. Such an approach would make it easy to spread hash values out over a wider range.
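To make the three categories concrete, here is a toy simulation in Java. It is only a sketch of the scheme as I described it, not real JVM code: the names IdentityHashSketch, ToyObject, HashState, moveTo and hashFromAddress are all made up for illustration, and the "address" is just a long field.

```java
public class IdentityHashSketch {
    // Hypothetical names throughout: a toy model, not actual JVM internals.
    enum HashState { NEVER_HASHED, HASHED_NOT_MOVED, HASHED_AND_MOVED }

    static final class ToyObject {
        long address;                           // simulated heap address
        HashState state = HashState.NEVER_HASHED;
        int savedHash;                          // the "extra four bytes", used only after a move

        ToyObject(long address) {
            this.address = address;
        }

        int identityHash() {
            if (state == HashState.HASHED_AND_MOVED) {
                return savedHash;               // third category: read the stored value
            }
            // First or second category: derive the hash from the current address
            // and remember that the hash has now been observed.
            state = HashState.HASHED_NOT_MOVED;
            return hashFromAddress(address);
        }

        // What the simulated GC does when it relocates the object.
        void moveTo(long newAddress) {
            if (state == HashState.HASHED_NOT_MOVED) {
                savedHash = hashFromAddress(address); // keep the pre-move hash
                state = HashState.HASHED_AND_MOVED;
            }
            address = newAddress;
        }
    }

    // Assumed mixing function: fold the 64-bit address into 32 bits.
    static int hashFromAddress(long address) {
        return (int) (address ^ (address >>> 32));
    }

    // True if an object's hash is unchanged after the simulated GC moves it.
    static boolean hashSurvivesMove() {
        ToyObject a = new ToyObject(0x1000L);
        int before = a.identityHash();
        a.moveTo(0x2000L);
        return a.identityHash() == before;
    }

    // True if a new object at the vacated address collides with the moved one.
    static boolean reusedAddressCollides() {
        ToyObject a = new ToyObject(0x1000L);
        int hashA = a.identityHash();
        a.moveTo(0x2000L);
        ToyObject b = new ToyObject(0x1000L);   // different object, old location
        return b.identityHash() == hashA;
    }

    public static void main(String[] args) {
        System.out.println("hash survives move: " + hashSurvivesMove());
        System.out.println("reused address collides: " + reusedAddressCollides());
    }
}
```

In this model the second check is exactly the situation from the question: two different objects end up with the same identity hash. That is a legal collision, since hash codes are not required to be unique, and it is the kind of duplicate that mixing in a per-GC value would make less likely.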
Upvotes: 2
Reputation: 26185
The key document for this is the Object API documentation for hashCode. It says, in part: "Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified."
In the case of Object, equality does not depend on any information in the object, only on whether two references point to the same object, so there is no condition under which the hash code can change.
The documentation also says "As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)"
In order to conform to the contract, "internal address" has to be something that stays with the object for the whole of its lifetime, not something that can be affected by heap compaction.
identityHashCode is defined in terms of the Object hash code, so this also controls its behavior.
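A small check of both points, as a sketch: the class name DefaultHashCodeDemo and its helper methods are made up for this example, and System.gc() is only a request, so the JVM may or may not actually move the object. The contract says the result must hold either way.

```java
public class DefaultHashCodeDemo {
    // For a plain Object, hashCode() is the default implementation, so it
    // should agree with System.identityHashCode().
    static boolean defaultMatchesIdentity() {
        Object o = new Object();
        return o.hashCode() == System.identityHashCode(o);
    }

    // The hash must stay the same for the object's lifetime, whether or not
    // the collector relocates it in between.
    static boolean stableAcrossGc() {
        Object o = new Object();
        int before = System.identityHashCode(o);
        System.gc(); // a hint only; a compacting collector may move o here
        return before == System.identityHashCode(o) && before == o.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("default hashCode matches identityHashCode: " + defaultMatchesIdentity());
        System.out.println("hash stable across GC request: " + stableAcrossGc());
    }
}
```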
Upvotes: 1