Jack

Reputation: 5870

How to sync up two ElasticSearch clusters

I need to set up a replicated ES cluster (cluster II) in data center II. Cluster II just needs to stay in sync with ES cluster I, which is in data center I. So far my idea is to store a snapshot in cluster II and restore it there so that it catches up with cluster I. But this approach introduces some delay. Is there a better way, please?

Upvotes: 0

Views: 1702

Answers (1)

signus

Reputation: 1148

Clustering is a concept baked into ElasticSearch. It was not designed to be stretched across datacenters, because that puts network latency between the nodes, but it can be done.

The idea behind ElasticSearch is to have a highly-available cluster that replicates shards within itself (i.e. keeping two copies of each shard, one primary and one replica, means the data lives on more than one node in your cluster). In that sense one cluster alone is its own backup.
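To make the replica arithmetic concrete, here is a minimal Python sketch. It is not part of the setup itself, and the function names are made up for illustration. It assumes the standard index setting number_of_replicas, which counts the extra copies per primary shard, so a value of 1 gives two copies in total.

```python
import json


def total_shard_copies(primaries: int, replicas: int) -> int:
    """Each primary shard exists once, plus `replicas` additional copies."""
    return primaries * (1 + replicas)


def replica_settings_body(replicas: int) -> str:
    """Build the JSON body for an index settings update that changes
    number_of_replicas (the helper name is hypothetical)."""
    return json.dumps({"index": {"number_of_replicas": replicas}})


if __name__ == "__main__":
    # 5 primaries with 1 replica each: 10 shard copies spread over the nodes.
    print(total_shard_copies(5, 1))
    print(replica_settings_body(1))
```

The resulting body is what you would send with PUT /<your-index>/_settings, where the index name is whatever you use.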

First, if you don't have it configured as a cluster, do so by adding the following to your /etc/elasticsearch/elasticsearch.yml (or wherever you put your config):

/etc/elasticsearch/elasticsearch.yml:
    cluster.name: thisismycluster
    node.name: ${HOSTNAME}

Alternatively, you can make node.name whatever you want, but it's best to put in your hostname.

You also want to make sure the ElasticSearch service is bound to a particular address and/or interface. Binding to the interface is probably your best bet, because you need a point-to-point link across those datacenters:

/etc/elasticsearch/elasticsearch.yml:
    network.host: [_tun1_]

You will also need to set a list of discovery hosts on every host in the cluster. Any listed node whose cluster.name matches will be discovered and assigned to that cluster. ElasticSearch takes care of the rest, it's magical!

You may add each host by name (only if it is defined in your /etc/hosts or resolvable through DNS across your datacenters) or by IP:

/etc/elasticsearch/elasticsearch.yml:
    discovery.zen.ping.unicast.hosts: ["ip1", "ip2", "..."]
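Discovery only works if the nodes can actually reach each other over the link between the datacenters. As a rough sanity check, a small Python sketch like the one below can be run from each node against the others. It assumes the default transport port 9300 (pass a different port if you changed it), and the function name and host names are placeholders of mine, not anything ElasticSearch provides.

```python
import socket


def transport_reachable(host: str, port: int = 9300, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only proves the port is reachable, not that the peer is an
    ElasticSearch node in the same cluster.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder hosts: substitute the entries from your unicast host list.
    for peer in ["ip1", "ip2"]:
        print(peer, transport_reachable(peer))
```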

Save the config and restart ElasticSearch:

sudo systemctl restart elasticsearch
OR
sudo service elasticsearch restart

If you aren't using systemd (depending on your OS), I would highly suggest using it.
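Once the nodes are back up, one way to confirm that they found each other is to look at the cluster health endpoint (GET /_cluster/health), which reports cluster_name, status and number_of_nodes. The Python sketch below is only an illustration: the function names, the default URL and the expected node count are assumptions you would adapt to your own setup.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url: str = "http://localhost:9200") -> dict:
    """Fetch and parse the cluster health document from one node."""
    with urlopen(f"{base_url}/_cluster/health", timeout=5) as resp:
        return json.load(resp)


def cluster_looks_joined(health: dict, expected_name: str, expected_nodes: int) -> bool:
    """True if the node reports the expected cluster name, sees at least
    the expected number of nodes, and the status is not red."""
    return (
        health.get("cluster_name") == expected_name
        and health.get("number_of_nodes", 0) >= expected_nodes
        and health.get("status") in ("green", "yellow")
    )


if __name__ == "__main__":
    # Assumes a node listening on localhost:9200 and four nodes in total.
    health = fetch_cluster_health()
    print(cluster_looks_joined(health, "thisismycluster", 4))
```

If the node count stays lower than expected, the discovery host list and the network binding are the first things to recheck.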

I will tell you, though, that doing snapshots with ElasticSearch for this is a terrible idea and should be avoided at all costs, because ElasticSearch already builds high availability into the application. This is why it is so powerful and is being heavily adopted by the community and companies alike.

Upvotes: 1
