Rares Dima

Reputation: 1767

Producer-consumer architecture over a network in Python?

I need a producer-consumer kind of architecture, where the producer puts data in a queue over and over, and then a consumer reads from that queue as fast as it can process the data.

For a producer and consumer running in separate processes we already have multiprocessing, whose Queue provides put and get. So even if the producer runs at 2-3 times the speed of the consumer, all the data sits in the queue (assume memory use is not a problem) and the consumer just calls q.get whenever it is ready for the next item.
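For reference, the single-machine setup described above might look like the minimal sketch below. The names `producer`, `consumer`, and `SENTINEL`, and the shape of the items, are made up for illustration and are not from any real code base.

```python
import multiprocessing as mp

SENTINEL = None  # hypothetical end-of-stream marker, not part of the Queue API


def producer(q, n):
    """Put n objects of different sizes on the queue, then the end marker."""
    for i in range(n):
        q.put({"id": i, "payload": "x" * i})
    q.put(SENTINEL)


def consumer(q, results):
    """Read whole objects until the end marker; the queue handles the framing."""
    while True:
        item = q.get()  # blocks until one complete object is available
        if item is SENTINEL:
            break
        results.put(item["id"])
    results.put(SENTINEL)


def main():
    q = mp.Queue()
    results = mp.Queue()
    p = mp.Process(target=producer, args=(q, 5))
    c = mp.Process(target=consumer, args=(q, results))
    p.start()
    c.start()
    # Drain the results before joining so no process blocks on a full pipe.
    for item_id in iter(results.get, SENTINEL):
        print("consumed", item_id)
    p.join()
    c.join()


if __name__ == "__main__":
    main()
```

The consumer never deals with byte counts here; each q.get returns exactly one object, which is the behaviour I would like to keep over a network.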

But I need the producer and consumer to be connected over a network, so probably through a socket (but I am open to other methods). The big problem with sockets is that they do not separate objects automatically like queues do.

With a multiprocessing.Queue, if I call q.get I get the next object: the queue takes care of how many bytes to read and recreates the object for me, and q.get just returns it. With a socket I have to pickle.dumps the object to send it, then be careful how many bytes I read on the other side (in case more than one object is waiting in the socket), and then pickle.loads the result. The main problem is keeping track of object sizes.

If I put 10 objects of different sizes that add up to 1000 bytes in a Queue, then the queue takes care of how many bytes to read for each object when I call q.get. With a socket, if I pickle the 10 objects and send them, the socket has no idea how to split the 1000-byte stream it receives, and creating a mechanism for this means adding a lot of new code.
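To make the size-tracking problem concrete, one common approach is to prefix every pickled object with its length, so the reader always knows how many bytes belong to the next object. The sketch below is only an illustration under that assumption: `send_obj`, `recv_exact`, `recv_obj`, and the 4-byte header are hypothetical choices, and `socket.socketpair()` stands in for a real network connection.

```python
import pickle
import socket
import struct

# Assumed framing: a 4-byte unsigned big-endian length, then the pickle bytes.
HEADER = struct.Struct("!I")


def send_obj(sock, obj):
    """Pickle obj and send it preceded by its length."""
    data = pickle.dumps(obj)
    sock.sendall(HEADER.pack(len(data)) + data)


def recv_exact(sock, n):
    """Read exactly n bytes; recv may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed in the middle of a message")
        buf.extend(chunk)
    return bytes(buf)


def recv_obj(sock):
    """Read one length-prefixed message and unpickle it."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    # pickle.loads can run arbitrary code: only use this with trusted peers.
    return pickle.loads(recv_exact(sock, length))


if __name__ == "__main__":
    a, b = socket.socketpair()  # in-process stand-in for a TCP connection
    for obj in ([1, 2, 3], {"k": "v"}, "x" * 100):
        send_obj(a, obj)
    for _ in range(3):
        print(recv_obj(b))
    a.close()
    b.close()
```

This gives put/get-like behaviour for one connection, but buffering, reconnects, and multiple consumers would all still be extra code.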

Is there some kind of... socket-based Queue or similar?

Upvotes: 0

Views: 227

Answers (1)

felipe

Reputation: 8045

This is usually solved with external software that acts as a message broker between the producer and the consumer over the network. There are a few open source projects you can look into.

They are all different in their own way, but they all have Python libraries you can easily pip install to begin using them. All of them will require that a third process is running to serve as the broker of messages.

Similarly, there are paid products for this as well - typically hosted in one of the big cloud providers - like AWS SQS.

This is not to say that it is impossible to create a custom socket or server implementation to do this, but a lot of the time in programming it's best not to reinvent the wheel.
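If you do go the do-it-yourself route for a simple one-producer, one-consumer setup, the standard library's multiprocessing.connection module already frames and pickles objects over a socket for you. The following is a rough sketch, not a replacement for a broker: `serve_consumer`, `produce`, and the `AUTHKEY` value are made-up names, and both ends run in one process here only to keep the demo short.

```python
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"change-me"  # hypothetical shared secret; pick your own


def serve_consumer(listener, handle):
    """Accept one producer and pass every received object to handle()."""
    with listener.accept() as conn:
        while True:
            try:
                obj = conn.recv()  # returns one complete unpickled object
            except EOFError:  # raised once the producer closes its end
                break
            handle(obj)


def produce(address, items):
    """Connect to the consumer and send each item as a separate object."""
    with Client(address, authkey=AUTHKEY) as conn:
        for item in items:
            conn.send(item)


if __name__ == "__main__":
    received = []
    # Port 0 asks the OS for a free port; listener.address reports the real one.
    with Listener(("127.0.0.1", 0), authkey=AUTHKEY) as listener:
        t = threading.Thread(
            target=serve_consumer, args=(listener, received.append)
        )
        t.start()
        produce(listener.address, [{"n": 1}, "two", [3, 3, 3]])
        t.join()
    print(received)
```

Note that recv unpickles whatever arrives, so this is only reasonable between machines you trust, and it gives you none of the persistence, retries, or fan-out that a real broker provides.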

Upvotes: 1
