DanielSank

Reputation: 3442

Why are dummy objects created in twisted's pb system?

As shown below, in twisted.spread.flavors.RemoteCache.unjellyFor, we create a dummy object called cProxy and return it to the rest of the client code, rather than returning self.

def unjellyFor(self, unjellier, jellyList):
    if unjellier.invoker is None:
        return setInstanceState(self, unjellier, jellyList)
    self.broker = unjellier.invoker
    self.luid = jellyList[1]
    cProxy = _newDummyLike(self)
    # XXX questionable whether this was a good design idea...
    init = getattr(cProxy, "__init__", None)
    if init:
        init()
    unjellier.invoker.cacheLocally(jellyList[1], self)
    cProxy.setCopyableState(unjellier.unjelly(jellyList[2]))
    # Might have changed due to setCopyableState method; we'll assume that
    # it's bad form to do so afterwards.
    self.__dict__ = cProxy.__dict__
    # chomp, chomp -- some existing code uses "self.__dict__ =", some uses
    # "__dict__.update".  This is here in order to handle both cases.
    self.broker = unjellier.invoker
    self.luid = jellyList[1]
    return cProxy

The body of _newDummyLike looks like this:

def _newDummyLike(instance):
    """
    Create a new instance like C{instance}.

    The new instance has the same class and instance dictionary as the given
    instance.

    @return: The new instance.
    """
    if isinstance(instance.__class__, type):
        # New-style class
        dummy = _DummyNewStyle()
    else:
        # Classic class
        dummy = _Dummy()
    dummy.__class__ = instance.__class__
    dummy.__dict__ = instance.__dict__
    return dummy

Since the dummy object cProxy shares its __dict__ and __class__ with the "real" object, I don't see the point of making the dummy at all. Why is the dummy created?

Upvotes: 2

Views: 100

Answers (2)

Glyph

Reputation: 31910

The purpose of these "dummy" objects is distributed garbage collection.

First, let's consider the simple case of a Copyable. Each time you serialize it, your peer gets a new RemoteCopy. Simple - nothing to keep track of. Your Copyable can be easily garbage collected at any time.

Next, Referenceable. Every time you serialize it, your peer gets a new RemoteReference. Now we have a problem: if your peers still have that RemoteReference, they should be able to call methods on your Referenceable, which means your broker now holds a strong reference to your Referenceable. Slightly trickier, but still fairly simple: each time a RemoteReference is garbage collected, in RemoteReference.__del__, we send a decref message which tells the sender that their Referenceable is no longer referenced. When the count gets to zero, the strong reference may be eliminated, and it will be garbage collected naturally. This works because a RemoteReference is effectively immutable - all it contains is the opaque identifier of the object in question, a reference to the broker, and nothing else.
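The decref mechanism described above can be sketched roughly like this. This is a toy illustration, not Twisted's actual API: Broker, send_decref, and RemoteReferenceSketch are hypothetical names, and the real protocol machinery is far more involved.

```python
# Rough sketch (hypothetical names, not Twisted's actual API) of the decref
# idea: the proxy is effectively immutable -- just a broker plus an opaque
# id -- and its __del__ tells the sending side that one reference is gone.
class Broker:
    def __init__(self):
        self.decrefs = []          # stand-in for messages sent over the wire

    def send_decref(self, luid):
        self.decrefs.append(luid)

class RemoteReferenceSketch:
    def __init__(self, broker, luid):
        self.broker = broker       # how to reach the peer
        self.luid = luid           # opaque identifier of the remote object

    def __del__(self):
        # On garbage collection, tell the sender to decrement its count;
        # at zero it can drop the strong reference to the Referenceable.
        self.broker.send_decref(self.luid)

broker = Broker()
ref = RemoteReferenceSketch(broker, 42)
del ref                            # CPython collects it immediately...
print(broker.decrefs)              # [42] ...and the decref was "sent"
```

Because the proxy holds no mutable state beyond the broker and the id, nothing prevents it from being collected the moment the application drops it.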

Finally, Cacheable. Here we run into a real conundrum. You serialize your Cacheable, and now the Broker needs to maintain a strong reference to the Cacheable, in order to be able to tell whether it's the same Cacheable being sent again, later. But on the other end of the wire, the RemoteCache - which needs to have a __del__ that informs us if the implicit reference to the Cacheable is going away - also has a strong reference from its broker, because the Cacheable may need to send updates to the RemoteCache.

This is a circular reference, and a bad one. We need some way to break the circularity, so that the thing we hand to the application - which gets garbage collected - tracks each time the server sent us a different copy of the RemoteCache. It turns out we can do that, and break the circular reference (remember, this code predates Python's cycle collector!), by having a separate instance object, each with its own __del__ method, for each RemoteCache sent across the wire. We preserve the illusion of consistency by giving the application objects which share their internal state (__dict__), without letting objects with a __del__ become uncollectable by participating in a cycle. This way, when RemoteCache proxy objects get garbage collected, they each send a message, and we leave the "real" one (self in your example) with the strong reference from the broker; it becomes the "zeroth" reference, which is why it's copied the first time as well. (You can see there's only one other call site for _newDummyLike, which is where repeat references to the same cacheable are deserialized - a new proxy object is returned each time.)
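The dummy trick itself can be shown in miniature. This is a sketch under assumed names (released, RemoteCacheSketch, new_dummy_like are illustrative, not Twisted's): every deserialization hands the application a fresh proxy, but all proxies for one remote object share a single __dict__, so they behave like one consistent object, while each proxy's __del__ still fires separately.

```python
# Minimal sketch (hypothetical names) of the dummy trick: proxies share
# state via __dict__ but are collected, and counted, individually.
released = []

class _Dummy:
    pass

class RemoteCacheSketch:
    def __del__(self):
        released.append("decref")  # one notification per proxy collected

def new_dummy_like(instance):
    dummy = _Dummy()
    dummy.__class__ = instance.__class__   # same class...
    dummy.__dict__ = instance.__dict__     # ...and the very same state dict
    return dummy

real = RemoteCacheSketch()
real.value = 1
proxy = new_dummy_like(real)
proxy.value = 2
print(real.value)      # 2 -- shared state: callers can't tell them apart
del proxy              # collecting one proxy sends one notification
del real               # collecting the "zeroth" reference sends another
print(released)        # ['decref', 'decref']
```

Note that the shared dict never refers back to the proxies, so nothing here forms a cycle: each __del__-bearing object is independently collectable.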

Hopefully this makes more sense!

Upvotes: 2

Jean-Paul Calderone

Reputation: 48335

This is just a trick to create the necessary object. Creating a new instance of a totally arbitrary, user-defined type is harder. What arguments do you pass to its __init__? What if its __init__ has undesirable side effects? Maybe you can use its __new__ method instead - but what arguments does that take? Or maybe it doesn't even have a __new__ method, or maybe its __new__ has side effects... etc.

Compared to figuring all of that out, which may not even be possible, this trick is pretty simple and straight-forward.
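Concretely, the trick looks like this. Awkward and make_without_init are illustrative names I've made up, not anything from Twisted:

```python
# Sketch of the "blank instance" trick: instantiate a trivial class, then
# point __class__ at the real class, sidestepping __init__ entirely.
side_effects = []

class _Blank:
    pass

class Awkward:
    def __init__(self, required_arg):
        side_effects.append("ran __init__")   # we never want this to run
        self.required_arg = required_arg

def make_without_init(cls, state):
    obj = _Blank()
    obj.__class__ = cls        # now isinstance(obj, cls) holds
    obj.__dict__ = state       # install the deserialized state directly
    return obj

a = make_without_init(Awkward, {"required_arg": 7})
print(isinstance(a, Awkward), a.required_arg, side_effects)  # True 7 []
```

On modern Pythons, object.__new__(SomeClass) achieves much the same end for ordinary classes; _newDummyLike's two branches exist because classic (pre-Python 2.2) classes needed a different kind of blank instance than new-style ones.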

Upvotes: 1
