Reputation: 2865
I wonder how weak references work internally, for example in .NET or in Java. My two general ideas are:
Either of these solutions seems clean nor efficient. Does anyone know how it is actually done?
Upvotes: 20
Views: 4725
Reputation: 6988
It seems that implementation of weak references is well-kept secret in the industry ;-). For example, as of now, wikipedia article lacks any implementation details. And look at the answers above (including the accepted): "go look at the source" or "I think" ;-\ .
Of all the answers, only the one referencing Python's PEP 205 is insightful. As it says, for any single object, there can be at most one weak reference, if we treat weakref as an entity itself.
The rest describes Squirrel language implementation. So, weakref is itself an object, when you put weak reference to an object in some container, you actually put reference to weakref object. Each ref-countable object has field to store pointer to its weakref, which is NULL until weakref to that object is actually requested. Each object has method to request weakref, which either returns existing (singleton) weakref from the field, or creates it and caches in the field.
Of course, weakref points to the original object. So, then you just need to go thru all the available places where references to objects are handled and add transparent handling of weakrefs (i.e. automatically dereference it). ("Transparent" alternative is to add virtual "access" method which will be identity for most objects, and actual dereference for weakref.)
And as object has pointer to its weakref, then the object can NULLify the weakref in own destructor.
This implementation is pretty clean (no magic "calls into GC" and stuff) and has O(1) runtime cost. Of course, it's pretty greedy of memory - need to add +1 pointer field to each object, even though typically for 90+% objects that would be NULL. Of course, VHLLs already have large memory overhead per object, and there may be chance to compact different "extra" fields. For example, object type is typically a small enumeration, so it may be possible to merge type and some kind of weakref reference into single machine word (say, keep weakref objects in a separate arena, and use index to that).
Upvotes: 5
Reputation: 81159
The normal approach, I think, is for the system to maintain some sort of list of weak references. When the garbage collector executes, before dead objects are removed, the system iterates through the list of weak references and invalidates any reference whose target has not been tagged live. Depending upon the system, this may occur before or after the system temporarily resurrects objects which are eligible for immediate finalization (in the case of .net, there are two kinds of WeakReference
--one of which is effectively processed before the system scans for finalizers, meaning that it will become invalid when its target becomes eligible for finalization, and one of which is processed after).
Incidentally, if I were designing a gc-based framework, I would add a couple of other goodies: (1) a means of declaring a reference-type storage location as holding a reference that's primarily of interest to someone else, and (2) A variety of WeakReference
which would could indicate that the only references to an object are in "of interest to someone else" storage locations. Although WeakReference
is a useful type, the act of turning a weak reference into a strong reference may prevent the system from ever recognizing that nobody would mind if its target disappeared.
Upvotes: 0
Reputation: 12770
Not sure I understood your question, but you can have a look at the implementation for the class WeakReference and its superclass Reference in Java. It is well commented and you can see it has a field treated specially by the GC and another one used directly by the VM.
Upvotes: 5
Reputation: 177574
Python's PEP 205 has a decent explanation of how weak references should behave in Python, and this gives some insight into how they can be implemented. Since a weak reference is immutable, you could have just one for each object, to which you pass out references as needed. Thus, when the object is destroyed, only one weak reference needs to be invalidated.
Upvotes: 5
Reputation: 1062770
In .NET, when a WeakReference
is created, the GC is asked for a handle/opaque token representing the reference. Then, when needed, WeakReference
uses this handle to ask the GC if that handle is still valid (i.e. the original object still exists) - and if so, it can get the actual object reference.
So this is building a list of tokens/handles against object addresses (and presumably maintaining that list during defragmentation etc)
I'm not sure I 100% understand the three bullets, so I hesitate to guess which (if any) that is closest to.
Upvotes: 8