Reputation: 14699
From my understanding, RabbitMQ clustering is for scalability, not availability, but using mirrored queues provides availability as well: if the master fails, an up-to-date slave can be promoted to master.
From the documentation:
Messages published to the queue are replicated to all slaves. Consumers are connected to the master regardless of which node they connect to, with slaves dropping messages that have been acknowledged at the master. Queue mirroring therefore enhances availability, but does not distribute load across nodes (all participating nodes each do all the work).
Therefore, load-balancing across the nodes for a given queue doesn't make sense as this will always add an extra trip from the node contacted to the master node for the queue (unless I'm misunderstanding something). Hence, we'd want to always be able to know which node is the master for a given queue.
I haven't worked with RabbitMQ very much, so perhaps I'm just missing it in the documentation, but there seems to be no way to determine the IP of a mirrored queue's master after a master failure, once a slave has been promoted. Every source I see merely remarks on one's ability to set the initial master node, which isn't very helpful for me. At any time t, how do I find the master node's IP for a given queue?
PS: It also seems bad to simply put the nodes behind a load balancer: if there's a network partition (which can occur even with nodes in the same LAN), we'd potentially be hitting nodes that can't communicate with the master for the queue OR, worse, there could be a split brain whose sides we'd be driving further apart, if you will.
Upvotes: 3
Views: 2334
Reputation: 4106
You can create a smart client that keeps track of the queue mirroring topology. This is possible using the Management Plugin and its REST API.
For example, for a queue, the request
curl -i -u guest:guest http://[HOST]:[PORT]/api/queues/[VHOST]/[QUEUE]
will return the following payload:
{
"messages": 0,
"slave_nodes": [
"rabbit@node1",
"rabbit@node0"
],
"synchronised_slave_nodes": [
"rabbit@node0",
"rabbit@node1"
],
"recoverable_slaves": [
"rabbit@node0"
],
"state": "running",
"name": "myQueue",
=>"node": "rabbit@node2"
}
For myQueue, your client will favor a connection to node2 (the myQueue master node) to minimize hops.
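A minimal sketch of how a client might look up the master from that payload, using only the Python standard library. The helper names (queue_url, master_host, fetch_queue) are made up for illustration, and the sketch assumes that the part of the node name after the "@" resolves to a host the client can reach, which depends on how your cluster is set up.

```python
import base64
import json
import urllib.parse
import urllib.request


def queue_url(host, port, vhost, queue):
    """Build the management API URL for one queue.

    The vhost is percent-encoded, so the default vhost "/" becomes "%2F".
    """
    return "http://{}:{}/api/queues/{}/{}".format(
        host,
        port,
        urllib.parse.quote(vhost, safe=""),
        urllib.parse.quote(queue, safe=""),
    )


def master_host(payload):
    """Return the host part of the "node" field, e.g. "rabbit@node2" -> "node2".

    Assumption: the host part of the node name is resolvable by the client.
    """
    node = payload["node"]
    return node.split("@", 1)[1] if "@" in node else node


def fetch_queue(host, port, vhost, queue, user="guest", password="guest", timeout=5):
    """GET the queue description from the management API with HTTP basic auth."""
    request = urllib.request.Request(queue_url(host, port, vhost, queue))
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage (needs a reachable management API, so it is left commented out):
# payload = fetch_queue("node0", 15672, "/", "myQueue")
# print(master_host(payload))
```

The client would have to repeat this lookup after every connection failure, since the master can change whenever a slave is promoted.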
I'm not sure it's worth the cost: it will increase the number of connections and the client complexity. I would be happy to receive feedback if you implement something.
Upvotes: 1
Reputation: 10192
You don't need the master node's IP; you just need the queues to be mirrored, so that all the messages in the queues are on all the nodes. The paragraph above the one you quoted contains this sentence:
Each mirrored queue consists of one master and one or more slaves, with the oldest slave being promoted to the new master if the old master disappears for any reason.
So the words master and slave relate to queues, not RabbitMQ nodes; I'm guessing this is the source of the confusion. After reading the question and then the docs again, I had to think about it for a while, but we can't say that a mirrored queue consists of master and slave RabbitMQ nodes ;)
As for load balancing the cluster, you can make sure clients always connect to a RabbitMQ node that is alive, either by using an actual load balancer or by making clients "smarter", i.e. they should reconnect to the IP of another node if the (original) master node goes down. The first approach is recommended; just look for Connecting to Clusters from Clients here.
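To illustrate the "smarter client" option, here is a rough sketch of trying a list of nodes in order until one accepts a connection. The function name and the connect callable are hypothetical; in a real client, connect would be whatever your AMQP library uses to open a connection to a single host.

```python
def connect_to_first_available(hosts, connect):
    """Try each host in order and return (host, connection) for the first success.

    `connect` is any callable that takes a host and either returns a
    connection object or raises OSError (which includes refused and
    timed-out connections).
    """
    errors = {}
    for host in hosts:
        try:
            return host, connect(host)
        except OSError as exc:
            # Remember why this node failed and move on to the next one.
            errors[host] = exc
    raise ConnectionError("no node reachable: {}".format(errors))


# Hypothetical usage with a plain TCP socket on the default AMQP port:
# import socket
# host, conn = connect_to_first_available(
#     ["node0", "node1", "node2"],
#     lambda h: socket.create_connection((h, 5672), timeout=3),
# )
```

On a dropped connection the client would call this again with the same host list, which is roughly what a load balancer does for you in the first approach.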
Upvotes: 1