Reputation: 31987
I need a queue which multiple threads can put stuff into, and multiple threads may read from.
Python has at least two queue classes, queue.Queue
and collections.deque
, with the former seemingly using the latter internally. Both claim to be thread-safe in the documentation.
However, the Queue
docs also state:
collections.deque
is an alternative implementation of unbounded queues with fast atomicappend()
andpopleft()
operations that do not require locking and also support indexing.
Which I guess I don't quite understand: Does this mean deque
isn't fully thread-safe after all?
If it is, I may not fully understand the difference between the two classes. I can see that Queue
adds blocking functionality. On the other hand, it loses some deque
features like support for the in
operator.
Is accessing the internal deque
object directly
x in Queue().queue
thread-safe?
Also, why does Queue
employ a mutex for its operations when deque
is thread-safe already?
Upvotes: 274
Views: 114044
Reputation: 107062
If all you're looking for is a thread-safe way to transfer objects between threads, then both would work (both for FIFO and LIFO). For FIFO:
Queue.put()
and Queue.get()
are thread-safeNote:
deque
might not be thread safe, I'm not sure.deque
does not block on pop()
or popleft()
so you can't base your consumer thread flow on blocking till a new item arrives.However, it seems that deque has a significant efficiency advantage. Here are some benchmark results in seconds using CPython 2.7.3 for inserting and removing 100k items
deque 0.0747888759791
Queue 1.60079066852
Here's the benchmark code:
import time
import Queue
import collections
q = collections.deque()
t0 = time.clock()
for i in xrange(100000):
q.append(1)
for i in xrange(100000):
q.popleft()
print 'deque', time.clock() - t0
q = Queue.Queue(200000)
t0 = time.clock()
for i in xrange(100000):
q.put(1)
for i in xrange(100000):
q.get()
print 'Queue', time.clock() - t0
Upvotes: 64
Reputation: 22665
queue.Queue
and collections.deque
serve different purposes. queue.Queue
is intended for allowing different threads to communicate using queued messages/data, whereas collections.deque
is simply intended as a data structure. That's why queue.Queue
has methods like put_nowait()
, get_nowait()
, and join()
, whereas collections.deque
doesn't. queue.Queue
isn't intended to be used as a collection, which is why it lacks the likes of the in
operator.
It boils down to this: if you have multiple threads and you want them to be able to communicate without the need for locks, you're looking for queue.Queue
; if you just want a queue or a double-ended queue as a datastructure, use collections.deque
.
Finally, accessing and manipulating the internal deque of a queue.Queue
is playing with fire - you really don't want to be doing that.
Upvotes: 420
Reputation: 61
Adding notify_all()
to each deque
append
and popleft
results in far worse results for deque
than the 20x improvement achieved by default deque
behavior:
deque + notify_all: 0.469802
Queue: 0.667279
@Jonathan modify his code a little and I get the benchmark using cPython 3.6.2 and add condition in deque loop to simulate the behaviour Queue do.
import time
from queue import Queue
import threading
import collections
mutex = threading.Lock()
condition = threading.Condition(mutex)
q = collections.deque()
t0 = time.clock()
for i in range(100000):
with condition:
q.append(1)
condition.notify_all()
for _ in range(100000):
with condition:
q.popleft()
condition.notify_all()
print('deque', time.clock() - t0)
q = Queue(200000)
t0 = time.clock()
for _ in range(100000):
q.put(1)
for _ in range(100000):
q.get()
print('Queue', time.clock() - t0)
And it seems the performance limited by
this function condition.notify_all()
collections.deque is an alternative implementation of unbounded queues with fast atomic append() and popleft() operations that do not require locking. docs Queue
Upvotes: 6
Reputation: 5520
All single-element methods on deque
are atomic and thread-safe. All other methods are thread-safe too. Things like len(dq)
, dq[4]
yield momentary correct values. But think e.g. about dq.extend(mylist)
: you don't get a guarantee that all elements in mylist
are filed in a row when other threads also append elements on the same side - but thats usually not a requirement in inter-thread communication and for the questioned task.
So a deque
is ~20x faster than Queue
(which uses a deque
under the hood) and unless you don't need the "comfortable" synchronization API (blocking / timeout), the strict maxsize
obeyance or the "Override these methods (_put, _get, ..) to implement other queue organizations" sub-classing behavior, or when you take care of such things yourself, then a bare deque
is a good and efficient deal for high-speed inter-thread communication.
In fact the heavy usage of an extra mutex and extra method ._get()
etc. method calls in Queue.py
is due to backwards compatibility constraints, past over-design and lack of care for providing an efficient solution for this important speed bottleneck issue in inter-thread communication. A list was used in older Python versions - but even list.append()/.pop(0) was & is atomic and threadsafe ...
Upvotes: 12
Reputation: 375
For information there is a Python ticket referenced for deque thread-safety (https://bugs.python.org/issue15329). Title "clarify which deque methods are thread-safe"
Bottom line here: https://bugs.python.org/issue15329#msg199368
The deque's append(), appendleft(), pop(), popleft(), and len(d) operations are thread-safe in CPython. The append methods have a DECREF at the end (for cases where maxlen has been set), but this happens after all of the structure updates have been made and the invariants have been restored, so it is okay to treat these operations as atomic.
Anyway, if you are not 100% sure and you prefer reliability over performance, just put a like Lock ;)
Upvotes: 14
Reputation: 140
(seems I don't have reputation to comment...) You need to be careful which methods of the deque you use from different threads.
deque.get() appears to be threadsafe, but I have found that doing
for item in a_deque:
process(item)
can fail if another thread is adding items at the same time. I got an RuntimeException that complained "deque mutated during iteration".
Check collectionsmodule.c to see which operations are affected by this
Upvotes: 1
Reputation: 34112
deque
is thread-safe. "operations that do not require locking" means that you don't have to do the locking yourself, the deque
takes care of it.
Taking a look at the Queue
source, the internal deque is called self.queue
and uses a mutex for accessors and mutations, so Queue().queue
is not thread-safe to use.
If you're looking for an "in" operator, then a deque or queue is possibly not the most appropriate data structure for your problem.
Upvotes: 2