Edwardr
Edwardr

Reputation: 2956

What's the pythonic way to deal with worker processes that must coordinate their tasks?

I'm currently learning Python (from a Java background), and I have a question about something I would have used threads for in Java.

My program will use workers to read from some web-service some data periodically. Each worker will call on the web-service at various times periodically.

From what I have read, it's preferable to use the multiprocessing module and set up the workers as independent processes that get on with their data-gathering tasks. On Java I would have done something conceptually similar, but using threads. While it appears I can use threads in Python, I'll lose out on multi-cpu utilisation.

Here's the guts of my question: The web-service is throttled, viz., the workers must not call on it more than x times per second. What is the best way for the workers to check on whether they may request data?

I'm confused as to whether this should be achieved using:

Of course, I guess this may come down to how I keep track of the calls per second. I suppose one option would be for the workers to call a function on some other object, which makes the call to the web-service and records the current number of calls/sec. Another option would be for the function that calls the web-service to live within each worker, and for them to message a managing object every time they make a call to the web-service.

Thoughts welcome!

Upvotes: 13

Views: 687

Answers (3)

quamrana
quamrana

Reputation: 39344

I think that you'll find that the multiprocessing module will provide you with some fairly familiar constructs.

You might find that multiprocessing.Queue is useful for connecting your worker threads back to a managing thread that could provide monitoring or throttling.

Upvotes: 2

Oben Sonne
Oben Sonne

Reputation: 9953

Not really an answer to your question, but an alternative approach to your problem: You could get rid of synchronization issues when doing requests event driven, e.g. by using the Python async module or Twisted. You wouldn't benefit from multiple CPUs/cores, but in context of network communication that's usually negligible.

Upvotes: 0

Ignacio Vazquez-Abrams
Ignacio Vazquez-Abrams

Reputation: 798386

Delegate the retrieval to a separate process which queues the requests until it is their turn.

Upvotes: 2

Related Questions