Reputation: 703
I have been reading through the RabbitMQ tutorials and am looking for some help choosing a setup.
I have a list of tasks 1-50 which I want run once (and only once) on a set of 4 computers, each running a worker. I have set up a template similar to tutorial 2 at https://www.rabbitmq.com/tutorials/tutorial-two-python.html
Not all of the computers can run all of the tasks (they haven't all got the software installed).
What I am trying to achieve is a setup that allows the tasks sent to a worker to be filtered.
I read the tutorials on how to achieve this in a broadcast situation using routing. However, I didn't quite grasp how to map this back to a simpler push model similar to tutorial 2 (I don't want to broadcast the jobs).
At some point down the line I would like to be able to scale the number of workers on each box dynamically based on load as well.
What is the best model to use, and are there any good tutorials or write-ups that you can recommend for learning about this approach?
Cheers, Rob
Upvotes: 5
Views: 3135
Reputation: 72868
RabbitMQ doesn't provide a way to selectively consume messages from a queue. A consumer on a queue will always have a chance to receive any given message in that queue. Therefore, you have to pre-filter the messages into queues for the specific type of work to be done. Once you've done that, your message consumers only consume from the queues for the type of work they can handle.
Say you have 3 types of work to do: JobA, JobB, and JobC.
If you try to push messages for all three types of work into a single queue, then your consumer has to be smart about which ones it can handle. This doesn't work. Your consumer would have to nack the message back on to the queue if it can't handle it. But there's no guarantee that your message will be picked up by a different consumer that can handle it. It may go back to the same consumer, which would then nack it back on to the queue again.
This is the selective consumer anti-pattern in RabbitMQ.
Instead, you need to pre-filter your messages into queues for specific types of work. You do this through the use of routing keys in the exchange -> queue bindings.
Using the three job types above, you could have a setup like this:
| exchange | routing key | queue |
| -------- | ----------- | ----- |
| jobs     | job.a       | job.a |
| jobs     | job.b       | job.b |
| jobs     | job.c       | job.c |
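A minimal sketch of that table in code, assuming the Python pika client (1.x) used by the RabbitMQ tutorials and a direct exchange. The helper names `declare_topology` and `publish_job` are hypothetical, durable queues are an assumption, and `channel` is expected to be a pika channel such as the one returned by `pika.BlockingConnection(...).channel()`.

```python
# Sketch only: assumes the pika 1.x client; helper names are illustrative.
JOB_TYPES = ("job.a", "job.b", "job.c")  # routing keys double as queue names


def declare_topology(channel, exchange="jobs", job_types=JOB_TYPES):
    """Declare the exchange, one queue per job type, and the bindings."""
    # A direct exchange routes on an exact routing-key match.
    channel.exchange_declare(exchange=exchange, exchange_type="direct", durable=True)
    for job_type in job_types:
        # Assumption: durable queues, so pending jobs survive a broker restart.
        channel.queue_declare(queue=job_type, durable=True)
        channel.queue_bind(exchange=exchange, queue=job_type, routing_key=job_type)


def publish_job(channel, job_type, body, exchange="jobs"):
    """Publish one task; its routing key decides which queue receives it."""
    channel.basic_publish(exchange=exchange, routing_key=job_type, body=body)
```

The producer never names a queue directly. It publishes to the `jobs` exchange with a routing key, and the binding delivers the message to exactly one queue, so nothing is broadcast.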
Your code that consumes these messages needs to know which type of job it can handle. Then, it only subscribes to the queue for that type of work.
Say you have 2 computers that are message consumers. Computer 1 can handle JobA and JobB. Computer 2 can handle JobB and JobC. In this scenario, you end up with 1 computer handling JobA, 2 computers handling JobB and 1 computer handling JobC. There are only 2 computers total, but each of them handles multiple jobs... only the jobs they know how to handle, though.
You guarantee Computer 1 only gets JobA and JobB by having it subscribe only to the job.a and job.b queues. The same goes for any other consumer in your system.
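The subscription rule could be sketched like this, again assuming pika 1.x. The `CAPABILITIES` map and the functions `handle_job`, `subscribe`, and `consumers_per_queue` are hypothetical names, and the prefetch setting is an assumption rather than a requirement.

```python
# Sketch only: assumes the pika 1.x client; names below are illustrative.
from collections import Counter

# Which queues each machine is allowed to consume from.
CAPABILITIES = {
    "computer1": ["job.a", "job.b"],
    "computer2": ["job.b", "job.c"],
}


def handle_job(ch, method, properties, body):
    """Placeholder handler: do the work, then acknowledge the message."""
    print(f"running {method.routing_key}: {body!r}")
    ch.basic_ack(delivery_tag=method.delivery_tag)


def subscribe(channel, queues, callback=handle_job):
    """Consume only from the queues this machine can handle."""
    # Assumption: take one unacknowledged message at a time per worker.
    channel.basic_qos(prefetch_count=1)
    for queue in queues:
        channel.basic_consume(queue=queue, on_message_callback=callback)


def consumers_per_queue(capabilities):
    """Count how many machines consume from each queue."""
    return Counter(q for queues in capabilities.values() for q in queues)
```

On Computer 1 this would be `subscribe(channel, CAPABILITIES["computer1"])` followed by `channel.start_consuming()`. With the map above, `consumers_per_queue(CAPABILITIES)` gives one consumer for job.a, two for job.b, and one for job.c.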
Having done this, scaling the number of workers is easy. You need more workers on JobA? No problem. Just add another consumer of the job.a queue.
Hope that helps!
Upvotes: 8