Reputation: 3604
Can someone provide an example and explain when and how to use Twisted's DeferredLock?
I have a DeferredQueue and I think I have a race condition I want to prevent, but I'm unsure how to combine the two.
Upvotes: 10
Views: 3656
Reputation: 48325
Use a DeferredLock when you have a critical section that is asynchronous and needs to be protected from overlapping (one might say "concurrent") execution.
Here is an example of such an asynchronous critical section:
class NetworkCounter(object):
    def __init__(self):
        self._count = 0

    def next(self):
        self._count += 1
        recording = self._record(self._count)

        def recorded(ignored):
            return self._count

        recording.addCallback(recorded)
        return recording

    def _record(self, value):
        return http.GET(
            b"http://example.com/record-count?value=%d" % (value,))
See how two concurrent uses of the next method will produce "corrupt" results:
from __future__ import print_function
counter = NetworkCounter()
d1 = counter.next()
d2 = counter.next()
d1.addCallback(print, "d1")
d2.addCallback(print, "d2")
Gives the result:
2 d1
2 d2
This is because the second call to NetworkCounter.next begins before the first call to that method has finished using the _count attribute to produce its result. The two operations share the single attribute and produce incorrect output as a consequence.
Using a DeferredLock instance will solve this problem by preventing the second operation from beginning until the first operation has completed. You can use it like this:
from twisted.internet.defer import DeferredLock

class NetworkCounter(object):
    def __init__(self):
        self._count = 0
        self._lock = DeferredLock()

    def next(self):
        # Let the lock decide when the critical section may run.
        return self._lock.run(self._next)

    def _next(self):
        self._count += 1
        recording = self._record(self._count)

        def recorded(ignored):
            return self._count

        recording.addCallback(recorded)
        return recording

    def _record(self, value):
        return http.GET(
            b"http://example.com/record-count?value=%d" % (value,))
First, notice that the NetworkCounter instance creates its own DeferredLock instance. Each instance of DeferredLock is distinct and operates independently from any other instance. Any code that participates in the use of a critical section needs to use the same DeferredLock instance in order for that critical section to be protected. If two NetworkCounter instances somehow shared state then they would also need to share a DeferredLock instance - not create their own private instance.
Next, see how DeferredLock.run is used to call the new _next method (into which all of the application logic has been moved). Neither NetworkCounter nor the application code using NetworkCounter calls the method that contains the critical section directly. DeferredLock is given responsibility for doing this. This is how DeferredLock can prevent the critical section from being run by multiple operations at the "same" time. Internally, DeferredLock will keep track of whether an operation has started and not yet finished. It can only keep track of operation completion if the operation's completion is represented as a Deferred though. If you are familiar with Deferreds, you probably already guessed that the (hypothetical) HTTP client API in this example, http.GET, is returning a Deferred that fires when the HTTP request has completed. If you are not familiar with them yet, you should go read about them now.
Once the Deferred that represents the result of the operation fires (in other words, once the operation is done), DeferredLock will consider the critical section "out of use" and allow another operation to begin executing it. It will do this by checking to see if any code has tried to enter the critical section while the critical section was in use and if so it will run the function for that operation.
Third, notice that in order to serialize access to the critical section, DeferredLock.run must return a Deferred. If the critical section is in use and DeferredLock.run is called it cannot start another operation. Therefore, instead, it creates and returns a new Deferred. When the critical section goes out of use, the next operation can start and when that operation completes, the Deferred returned by the DeferredLock.run call will get its result. This all ends up looking rather transparent to any users who are already expecting a Deferred - it just means the operation appears to take a little longer to complete (though the truth is that it likely takes the same amount of time to complete but has to wait a while before it starts - the effect on the wall clock is the same though).
Of course, you can achieve a concurrent-use safe NetworkCounter more easily than all this by simply not sharing state in the first place:
class NetworkCounter(object):
    def __init__(self):
        self._count = 0

    def next(self):
        self._count += 1
        result = self._count
        recording = self._record(self._count)

        def recorded(ignored):
            return result

        recording.addCallback(recorded)
        return recording

    def _record(self, value):
        return http.GET(
            b"http://example.com/record-count?value=%d" % (value,))
This version moves the state used by NetworkCounter.next to produce a meaningful result for the caller out of the instance dictionary (i.e., it is no longer an attribute of the NetworkCounter instance) and into the call stack (i.e., it is now a closed over variable associated with the actual frame that implements the method call). Since each call creates a new frame and a new closure, concurrent calls are now independent and no locking of any sort is required.
Finally, notice that even though this modified version of NetworkCounter.next still uses self._count, which is shared amongst all calls to next on a single NetworkCounter instance, this can't cause any problems for the implementation when it is used concurrently. In a cooperative multitasking system such as the one primarily used with Twisted, there are never context switches in the middle of functions or operations. There cannot be a context switch from one operation to another in between the self._count += 1 and result = self._count lines. They will always execute atomically and you don't need locks around them to avoid re-entrancy or concurrency induced corruption.
These last two points - avoiding concurrency bugs by avoiding shared state and the atomicity of code inside a function - combined mean that DeferredLock isn't often particularly useful. As a single data point, in the roughly 75 KLOC in my current work project (heavily Twisted based), there are no uses of DeferredLock.
Upvotes: 21