sihil
sihil

Reputation: 2721

Elasticseach 6.1 EC2 cluster discovery not working

I'm currently upgrading a cluster to 6.1 and have been unable to get the nodes to discover each other at startup. The three individual nodes boot but then get stuck in a loop:

[2018-01-08T11:33:01,421][WARN ][o.e.d.z.ZenDiscovery     ] [ip-10-xxx-xxx-xxx] not enough master nodes discovered during pinging (found [[Candidate{node={ip-10-xxx-xxx-xxx}{gMlxxxxxRW-74axxxQ8V-3x}{6gBBYZxxxxxxxon=-1}]], but needed [2]), pinging again

The relevant part of my configuration is:

# Use the AWS private IP as self identifier
http.host: _ec2:privateIp_
network.host: _ec2:privateIp_
http.bind_host: 0.0.0.0
network.bind_host: 0.0.0.0

discovery.zen.hosts_provider: ec2
# These are expanded in my CloudFormation template
discovery.ec2.tag.Stack: @@STACK
discovery.ec2.tag.App: @@APP
discovery.ec2.tag.Stage: @@STAGE

Switching on debug for discovery (using logger.org.elasticsearch.discovery.ec2: "TRACE") nets me some evidence that the discovery process is failing:

[2018-01-08T11:32:58,419][TRACE][o.e.d.e.AwsEc2UnicastHostsProvider] [ip-10-xxx-xxx-xxx] building dynamic unicast discovery nodes...
[2018-01-08T11:32:58,420][DEBUG][o.e.d.e.AwsEc2UnicastHostsProvider] [ip-10-xxx-xxx-xxx] using dynamic discovery nodes []

Upvotes: 3

Views: 707

Answers (1)

sihil
sihil

Reputation: 2721

After further debugging I discovered that the documentation is not correct.

The documentation for the endpoint setting says: "The ec2 service endpoint to connect to. This will be automatically figured out by the ec2 client based on the instance location, but can be specified explicitly."

Unfortunately this isn't true and there is an open issue at https://github.com/elastic/elasticsearch/issues/27464.

When troubleshooting further I switched on AWS logging using logger.com.amazonaws.request: "DEBUG" in my elasticsearch config. This provided an entry in the log file stating that it was contacting us-east-1 despite the instance being in eu-west-1:

[2018-01-08T12:26:40,029][DEBUG][c.a.request              ] Sending Request: POST https://ec2.us-east-1.amazonaws.com / Parameters: ({"Action":["DescribeInstances"],"Version":["2016-11-15"] ...<snip>

It looks like they are aware and likely to fix it to make the behaviour of the plugin match the documentation (see https://github.com/elastic/elasticsearch/issues/27924) but in the meantime the fix is to explicitly set the endpoint using something like:

discovery.ec2.endpoint: ec2.eu-west-1.amazonaws.com

Upvotes: 5

Related Questions