Reputation: 75
I have a CockroachDB cluster of 3 nodes running on 3 different machines, orchestrated by Docker Swarm.
For a reason unrelated to CockroachDB, my whole swarm went down and every CockroachDB container stopped. Now I need to recover and restart the cluster.
The issue is that, during boot, each cockroach node wants to reach the other nodes before it starts up:
* WARNING: The server appears to be unable to contact the other nodes in the cluster. Please try:
*
* - starting the other nodes, if you haven't already;
* - double-checking that the '--join' and '--listen'/'--advertise' flags are set up correctly;
* - running the 'cockroach init' command if you are trying to initialize a new cluster.
The question is: how can I start the first container when no other node is available yet? Or do I need to initialize a new cluster?
Upvotes: 2
Views: 1040
Reputation: 21115
Cockroach nodes need to be able to reach each other to start up again. In most scenarios, the addresses of the nodes don't change and the nodes will automatically try the previously seen addresses (this is persisted to local disk alongside the cockroach data). If the addresses have changed between invocations, you need to tell the nodes about the new addresses.
Given three nodes, you could specify all node addresses in --join:
# on node 1:
cockroach start <flags> --join=node1address,node2address,node3address
# on node 2:
cockroach start <flags> --join=node1address,node2address,node3address
# on node 3:
cockroach start <flags> --join=node1address,node2address,node3address
You could also specify any subset (e.g. --join=node1address on all nodes, or the addresses of all other nodes).
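If it helps, here is a minimal dry-run sketch of the full-list variant. It only prints the command for each node so you can review it before putting it in your Swarm service definition; it does not start anything. The addresses are the placeholders used above, `<flags>` stands for whatever flags you already pass, the `start` subcommand is assumed, and the helper names `join_list` and `print_start_commands` are made up for this example.

```shell
#!/bin/sh
# Placeholder addresses (assumption): replace with the names or IPs that
# your nodes can actually resolve after the swarm comes back up.
NODES="node1address node2address node3address"

# Turn a space-separated list of addresses into a comma-separated --join value.
join_list() {
  printf '%s\n' "$*" | tr ' ' ','
}

# Print (do not run) the start command for each node.
# Every node gets the same --join list, as in the example above.
print_start_commands() {
  joined=$(join_list $NODES)
  for n in $NODES; do
    echo "# on ${n}:"
    echo "cockroach start <flags> --join=${joined}"
  done
}

print_start_commands
```

Using the same list everywhere keeps the service definitions identical, and each node can come up in any order as long as the others are started too.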
You must not run init again. It is only needed to initialize the first node and is done exactly once in the lifetime of a cluster.
Upvotes: 3