Reputation: 535
I have a cluster of 3 RabbitMQ nodes spread across 3 different servers. The second and third nodes join the first node to form the cluster. While testing failover, I found that once the primary node is killed, I cannot make it rejoin the cluster. The documentation does not say that I have to run join_cluster or any other command after startup. I tried join_cluster, but it is rejected because the cluster's name is the same as the node's own name. Is there a way to make this work?
cluster_status displays the following (not from the primary node):
Cluster status of node 'rabbit@<secondary>' ...
[{nodes,[{disc,['rabbit@<primary>','rabbit@<secondary>',
'rabbit@<tertiary>']}]},
{running_nodes,['rabbit@<secondary>','rabbit@<tertiary>']},
{cluster_name,<<"rabbit@<primary>">>},
{partitions,[]}]
Upvotes: 2
Views: 3235
Reputation: 86
On one of the nodes that is still in the cluster, run:
rabbitmqctl forget_cluster_node rabbit@rabbitmq1
This makes the current cluster forget the old primary. You should then be able to rejoin the cluster from the old primary (rabbitmq1):
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@rabbitmq2
rabbitmqctl start_app
See the RabbitMQ clustering guide for reference.
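The two steps above can be wrapped in a small script. This is only a sketch: the helper names (`run`, `forget_node`, `rejoin_via`) and the `DRY_RUN` switch are made up for illustration, and the node names are the placeholders used in this answer. With `DRY_RUN=1` (the default here) the commands are printed instead of executed, so you can check them first.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are illustrative, not part of RabbitMQ.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run on a node that is still in the cluster; $1 is the node to forget.
forget_node() {
    run rabbitmqctl forget_cluster_node "$1"
}

# Run on the node that should rejoin; $1 is any running cluster member.
rejoin_via() {
    run rabbitmqctl stop_app &&
    run rabbitmqctl join_cluster "$1" &&
    run rabbitmqctl start_app
}
```

For example, `forget_node rabbit@rabbitmq1` on rabbitmq2, then `rejoin_via rabbit@rabbitmq2` on rabbitmq1. Depending on the RabbitMQ version, the forgotten node may also need `rabbitmqctl reset` between `stop_app` and `join_cluster`; that step is not part of the commands above, so treat it as an assumption to check against the guide.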
Upvotes: 1
Reputation: 10202
A quote from the RabbitMQ clustering guide:
Nodes that have been joined to a cluster can be stopped at any time. It is also ok for them to crash. In both cases the rest of the cluster continues operating unaffected, and the nodes automatically "catch up" with the other cluster nodes when they start up again.
So you just need to start the node that you killed or stopped. It makes no difference whether it was the "primary" or not: if it was the primary and was killed, another node becomes the primary.
I've just tested this (with Docker, of course) and it works as expected.
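Roughly, the restart and check look like the sketch below. `START_CMD` is an assumption: it defaults to `rabbitmq-server -detached`, but you should replace it with however your node is normally started (service manager, `docker start`, and so on). The helper names and the `DRY_RUN` switch are made up for illustration. With `DRY_RUN=1` (the default) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: START_CMD, DRY_RUN and the helper names are illustrative.
DRY_RUN="${DRY_RUN:-1}"
START_CMD="${START_CMD:-rabbitmq-server -detached}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start the stopped node again; run this on that node.
start_node() {
    # Word splitting of START_CMD is intentional (command plus arguments).
    run $START_CMD
}

# Show the cluster status; the restarted node should appear under running_nodes.
show_status() {
    run rabbitmqctl cluster_status
}

start_node && show_status
```

Once the node is up, it should be listed under running_nodes again without any join_cluster call.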
Upvotes: 0