Reputation: 4408
So I have two main worker processes: one is supposed to fetch the content of a URL, and the other inserts each of the anchor links into a Postgres table. Right now I am running 12 instances of the first process, all of which take their URLs from a single URL queue and then place the anchors in a second queue. But how do I get another set of threads to push the anchors into the table? When I start those threads they find their queue empty and die; if I disable that behaviour they won't die when the work is done. How do I manage this? And by the way, is it better to use processes instead of threads, given the presumably intensive I/O involved?
Upvotes: 1
Views: 359
Reputation: 16525
You need two queues: the URLFetchers will pop URLs from one queue and push the anchors into a second one, and the AnchorInserters should pop from this second queue to process the data. This organisation should give you a good synchronisation mechanism for your problem.
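A minimal sketch of that organisation, assuming the standard threading and queue modules; fetch_page and insert_anchor are hypothetical stand-ins for your own download and Postgres-insert code, and the thread counts are arbitrary:

import queue
import threading

url_queue = queue.Queue()     # URLFetchers take URLs from here
anchor_queue = queue.Queue()  # URLFetchers put anchors here; AnchorInserters take them out

def fetch_page(url):
    # placeholder: download the page and return the anchor hrefs found on it
    return []

def insert_anchor(anchor):
    # placeholder: INSERT the anchor into your Postgres table
    pass

def url_fetcher():
    while True:
        url = url_queue.get(block=True)        # wait for the next URL
        for anchor in fetch_page(url):
            anchor_queue.put(anchor)
        url_queue.task_done()

def anchor_inserter():
    while True:
        anchor = anchor_queue.get(block=True)  # wait for the next anchor
        insert_anchor(anchor)
        anchor_queue.task_done()

for _ in range(12):
    threading.Thread(target=url_fetcher, daemon=True).start()
for _ in range(4):
    threading.Thread(target=anchor_inserter, daemon=True).start()

# seed the pipeline, then wait until every queued item has been processed;
# because the worker threads are daemons, the program can exit once the work is done
url_queue.put("http://example.com")
url_queue.join()
anchor_queue.join()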
Edit: to avoid the workers exiting, you need to block until an element is available:
while True:
    element = queue.get(block=True, timeout=None)
    # do the worker's task with element
From Python's Queue.get documentation:
If optional args block is true and timeout is None (the default), block if necessary until an item is available.
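Putting the blocking get together with the actual database work, an AnchorInserter loop could look roughly like this (a sketch assuming psycopg2; the DSN, table and column names are made up, so adapt them to your schema):

import queue
import psycopg2

anchor_queue = queue.Queue()  # filled by the URLFetchers

def anchor_inserter():
    # give each inserter thread its own connection rather than sharing one
    conn = psycopg2.connect("dbname=crawler user=crawler")  # assumed DSN
    cur = conn.cursor()
    while True:
        anchor = anchor_queue.get(block=True, timeout=None)  # sleeps until an item arrives
        cur.execute("INSERT INTO anchors (href) VALUES (%s)", (anchor,))
        conn.commit()
        anchor_queue.task_done()

Because the get blocks, the thread just waits while the queue is momentarily empty instead of dying; if you start it as a daemon thread, it also will not keep the program alive once the main thread has finished.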
Upvotes: 1