Reputation: 1767
I need a producer-consumer kind of architecture, where the producer puts data in a queue over and over, and then a consumer reads from that queue as fast as it can process the data.
For the producer and consumer running in separate processes we already have multiprocessing
, with Queue
where you have put
and get
. So even if the producer runs as 2-3 times the speed of the consumer, all the data is in the queue (assume memory use is not a problem) and the consumer just calls q.get
whenever it needs to.
But I need the producer and consumer to be connected over a network, so probably tough a socket (but I am open to other methods). The big problem with sockets is that they do not separate objects automatically like queues do.
For a multiprocessing.Queue
if I call q.get
I get the next object, the queue takes care of how many bytes to read and recreates the object for me, q.get
just returns the object. With a socket I have to pickle.dumps
to send it and then I need to be careful how many bytes to read from the socket (in case there is more than 1 object in the socket) and then pickle.loads
the result. The main problem is keeping track of object sizes.
If I put 10 objects of different sizes that add up to 1000 bytes in a Queue
then the queue takes care of how many bytes to read for every object when calling q.get
. For a socket if I pickle the 10 objects and send them, the socket has no idea how to split the big 1000 byte string inside it, and creating a mechanism for this means adding alot of new code.
Is there some kind of... socket-based Queue
or similar?
Upvotes: 0
Views: 227
Reputation: 8045
This is usually solved with an external software that will act as a broker for the producer and consumer over the internet. There are a few open source projects you can look into;
They are all different in their own way, but they all have Python libraries you can easily pip install
to begin using them. All of them will require that a third process is running to serve as the broker of messages.
Similarly, there are paid products for this as well - typically hosted in one of the big cloud providers - like AWS SQS.
This is not to say that it is not possible to create a custom socket or server implementation to do this... but, a lot of times in programming, it's best not to try to rebuild the wheel.
Upvotes: 1