Reputation: 19363
I am using RabbitMQ to have worker processes encode video files. I would like to know when all of the files are complete - that is, when all of the worker processes have finished.
The only way I can think to do this is by using a database. When a video finishes encoding:
UPDATE videos SET status = 'complete' WHERE filename = 'foo.wmv'
-- etc etc etc as each worker finishes --
And then to check whether or not all of the videos have been encoded:
SELECT count(*) FROM videos WHERE status != 'complete'
But if I'm going to do this, then I feel like I am losing the benefit of RabbitMQ as a mechanism for multiple distributed worker processes, since I still have to manually maintain a database queue.
Is there a standard mechanism for RabbitMQ dependencies? That is, a way to say "wait for these 5 tasks to finish, and once they are done, then kick off a new task?"
I don't want to have a parent process add these tasks to a queue and then "wait" for each of them to return a "completed" status. Then I have to maintain a separate process for each group of videos, at which point I've lost the advantage of decoupled worker processes as compared to a single ThreadPool concept.
Am I asking for something which is impossible? Or, are there standard widely-adopted solutions to manage the overall state of tasks in a queue that I have missed?
Edit: after searching, I found this similar question: Getting result of a long running task with RabbitMQ
Are there any particular thoughts that people have about this?
Upvotes: 15
Views: 7840
Reputation: 1079
Based on Brendan's extremely helpful answer, which should be accepted, I knocked up this quick diagram which be helpful to some.
Upvotes: 6
Reputation: 32392
I have implemented a workflow where the workflow state machine is implemented as a series of queues. A worker receives a message on one queue, processes the work, and then publishes the same message onto another queue. Then another type of worker process picks up that message, etc.
In your case, it sounds like you need to implement one of the patterns from Enterprise Integration Patterns (that is a free online book) and have a simple worker that collects messages until a set of work is done, and then processes a single message to a queue representing the next step in the workflow.
Upvotes: 1
Reputation: 54252
Use a "response" queue. I don't know any specifics about RabbitMQ, so this is general:
numSent == numResponded
, you're doneSomething to keep in mind is a timeout -- What happens if a child process dies? You have to do slightly more work, but basically:
This is called the Request Reply Pattern.
Upvotes: 25