Reputation: 1744
I have 3 ElasticSearch nodes in a cluster on AWS EC2. My client apps use connection pooling and have the public IP addresses for all 3 nodes in their config files.
The problem I have is that EC2 seems to occasionally reassign public IP addresses for these instances. They also change if I stop and restart an instance.
My app stays online for a while, since the connection pool round-robins across the three known IP addresses, but eventually all three will change and the app will stop working.
So, how should I be setting up an ElasticSearch cluster on EC2 so that my clients can continue to connect even if the instances change IP addresses?
Is there an option I'm missing?
Upvotes: 2
Views: 3675
Reputation: 27515
Use one or two query-only nodes, referred to in the documentation as "non data" nodes.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html
In front of the cluster we can start one or more "non data" nodes which will start with HTTP enabled. All HTTP communication will be performed through these "non data" nodes.
The benefit of using that is first the ability to create smart load balancers. These "non data" nodes are still part of the cluster, and they redirect operations exactly to the node that holds the relevant data. The other benefit is the fact that for scatter / gather based operations (such as search), these nodes will take part of the processing since they will start the scatter process, and perform the actual gather processing.
These nodes don't need much disk (they handle query and index processing only). Route all your requests through them. You can keep adding data nodes as you ingest more data without changing these "non data" nodes. Run a couple of them (to be safe) and give them either DNS names or Elastic IP addresses. You need far fewer of those addresses because these are not data nodes, and you tend not to change them as often as you do data nodes.
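On the client side, routing everything through those nodes could look roughly like the sketch below. It is only an illustration: the hostnames `es-query-0.example.com` and `es-query-1.example.com` and the helper names are made up, and it assumes the "non data" nodes listen for HTTP on the default port 9200.

```python
import itertools

# Hypothetical DNS names for two "non data" nodes; substitute your own.
COORDINATOR_HOSTS = ["es-query-0.example.com", "es-query-1.example.com"]


def make_url_cycle(hosts, port=9200, scheme="http"):
    """Return an endless iterator of base URLs, one per host, in round-robin order."""
    urls = [f"{scheme}://{host}:{port}" for host in hosts]
    return itertools.cycle(urls)


# One shared cycle for the whole client process.
url_cycle = make_url_cycle(COORDINATOR_HOSTS)


def next_base_url():
    """Pick the base URL to use for the next request."""
    return next(url_cycle)
```

Because the client only ever knows these two names, data nodes can come and go behind them without any change to the client config.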
This configuration approach is documented in the elasticsearch.yml file, quoted below:
# You want this node to be neither master nor data node, but to act as a "search load balancer" (fetching data from nodes, aggregating results, etc.)
node.master: false
node.data: false
Upvotes: 1
Reputation: 786
I haven't seen the changing IP address on a running instance that you're describing, but using this approach, it shouldn't matter:
Use DNS names for everything, not IP addresses.
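To see why names help, here is a small sketch of looking a hostname up at connect time instead of storing its IP. The function name `resolve_now` is my own, and the port default of 9200 is an assumption based on the question. A client that reconnects by name picks up a new address once DNS is updated, subject to whatever TTL and caching sit in between.

```python
import socket


def resolve_now(hostname, port=9200):
    """Return the IP addresses the hostname resolves to right now (sorted, de-duplicated)."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # "localhost" is used only so the example runs anywhere.
    print(resolve_now("localhost"))
```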
Let's say you want to hit your cluster via http://elastic.rabblerabble.com:9200.
Create the EC2 instances for your nodes. Name them elastic-0, elastic-1, and elastic-2.
In EC2 Load Balancers, create an ELB named 'es-elb' that includes each of these instances by name, with port forwarding of port 9200.
In Route 53, create a unique CNAME for each of your instances, with the instance's Public DNS name as the value, plus a CNAME for your ELB:
Name                          Type   Value
elastic-0.rabblerabble.com.   CNAME  Public DNS of instance elastic-0
elastic-1.rabblerabble.com.   CNAME  Public DNS of instance elastic-1
elastic-2.rabblerabble.com.   CNAME  Public DNS of instance elastic-2
elastic.rabblerabble.com.     CNAME  Public DNS of ELB es-elb
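If you would rather script those records than click through the console, the records above could be expressed as a Route 53 change batch along these lines. This is a sketch, not a tested deployment script: the function name `cname_change_batch` and the TTL of 60 seconds are my own choices, and the targets are placeholders to be replaced with the real Public DNS values.

```python
import json


def cname_change_batch(records, ttl=60):
    """Build a Route 53 ChangeBatch that upserts one CNAME per (name, target) pair."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": target}],
                },
            }
            for name, target in records
        ]
    }


# Placeholder targets: fill in the Public DNS shown for each instance and the ELB.
records = [
    ("elastic-0.rabblerabble.com.", "<Public DNS of instance elastic-0>"),
    ("elastic-1.rabblerabble.com.", "<Public DNS of instance elastic-1>"),
    ("elastic-2.rabblerabble.com.", "<Public DNS of instance elastic-2>"),
    ("elastic.rabblerabble.com.", "<Public DNS of ELB es-elb>"),
]

batch = cname_change_batch(records)
print(json.dumps(batch, indent=2))
```

With boto3 installed, a dict like this is what you would pass as `ChangeBatch` to the Route 53 client's `change_resource_record_sets` call, together with your own hosted zone ID.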
There's more needed for security, health checks, etc. but that's outside the scope of the question.
Upvotes: 2