stecog

Reputation: 2344

Elasticsearch cluster 'master_not_discovered_exception'

I have installed Elasticsearch 2.2.3 and configured a cluster of 2 nodes.

Node 1 (elasticsearch.yml)

cluster.name: my-cluster
node.name: node1
bootstrap.mlockall: true
discovery.zen.ping.unicast.hosts: ["ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com", "ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com"]
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
indices.fielddata.cache.size: "30%"
indices.cache.filter.size: "30%"
node.master: true
node.data: true
http.cors.enabled: true
script.inline: false
script.indexed: false
network.bind_host: 0.0.0.0

Node 2 (elasticsearch.yml)

cluster.name: my-cluster
node.name: node2
bootstrap.mlockall: true
discovery.zen.ping.unicast.hosts: ["ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com", "ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com"]
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
indices.fielddata.cache.size: "30%"
indices.cache.filter.size: "30%"
node.master: false
node.data: true
http.cors.enabled: true
script.inline: false
script.indexed: false
network.bind_host: 0.0.0.0

If I run curl -XGET 'http://localhost:9200/_cluster/state?pretty' I get:

{
  "error" : {
    "root_cause" : [ {
      "type" : "master_not_discovered_exception",
      "reason" : null
    } ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

In the log of node 1 I have:

[2016-06-22 13:33:56,167][INFO ][cluster.service          ] [node1] new_master {node1}{Vwj4gI3STr6saeTxKkSqEw}{127.0.0.1}{127.0.0.1:9300}{master=true}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-06-22 13:33:56,210][INFO ][http                     ] [node1] publish_address {127.0.0.1:9200}, bound_addresses {[::]:9200}
[2016-06-22 13:33:56,210][INFO ][node                     ] [node1] started
[2016-06-22 13:33:56,221][INFO ][gateway                  ] [-node1] recovered [0] indices into cluster_state

In the log of node 2, instead:

[2016-06-22 13:34:38,419][INFO ][discovery.zen            ] [node2] failed to send join request to master [{node1}{Vwj4gI3STr6saeTxKkSqEw}{127.0.0.1}{127.0.0.1:9300}{master=true}], reason [RemoteTransportException[[node2][127.0.0.1:9300][internal:discovery/zen/join]]; nested: IllegalStateException[Node [{node2}{_YUbBNx9RUuw854PKFe1CA}{127.0.0.1}{127.0.0.1:9300}{master=false}] not master for join request]; ]

Where is the error?

Upvotes: 27

Views: 112979

Answers (9)

pratsy

Reputation: 599

In case you are using Elasticsearch 7:

Update the elasticsearch.yml file at /etc/elasticsearch:

node.name: "node-1"
network.host: ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com
http.port: 9200
cluster.initial_master_nodes: ["node-1"]

Here node.name and the first value of cluster.initial_master_nodes should be the same.
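
After restarting the service, a quick sanity check with the standard cluster APIs (run on the node itself) confirms whether a master was actually elected:

# should report a status of green or yellow instead of a 503
curl -XGET 'http://localhost:9200/_cluster/health?pretty'

# shows which node currently holds the master role
curl -XGET 'http://localhost:9200/_cat/master?v'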

Upvotes: 10

Andre Leon Rangel

Reputation: 1799

I used AWS EC2 instances with CentOS 7.

My issue was that there were no IP routes between the nodes. I had to open some firewall ports with the commands below, and that solved the problem.

sudo firewall-cmd --permanent --add-port=8080/tcp
sudo firewall-cmd --permanent --add-port=9200/tcp
sudo firewall-cmd --permanent --add-port=9300/tcp
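
Note that --permanent only changes the saved configuration; the running firewall still needs a reload before the new rules take effect:

# apply the permanent rules to the running firewall
sudo firewall-cmd --reload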

Upvotes: 0

Abhishek katiyar

Reputation: 101

This could be the reason the master node was not being discovered. If the EC2 instances are in the same VPC, provide the private IP in /etc/elasticsearch/elasticsearch.yml as shown below:

cluster.initial_master_nodes: ["<PRIVATE-IP>"]

Note: after the above configuration change, restart the Elasticsearch service, e.g. sudo service elasticsearch stop followed by sudo service elasticsearch start if the OS is Ubuntu.
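
For illustration, a minimal sketch of such a configuration on Elasticsearch 7.x, assuming two nodes with placeholder private IPs 10.0.0.1 and 10.0.0.2 (the entries in cluster.initial_master_nodes must match the node names, so setting node.name explicitly avoids surprises):

# node 1: /etc/elasticsearch/elasticsearch.yml (addresses are placeholders)
node.name: node-1
network.host: 10.0.0.1
discovery.seed_hosts: ["10.0.0.1", "10.0.0.2"]
cluster.initial_master_nodes: ["node-1"]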

Upvotes: 8

avp

Reputation: 3330

Sandeep's answer above hinted to me that the nodes weren't able to talk to each other. When I dug into this, I found that I was missing an inbound rule for TCP port 9300 in EC2's security group. I added the rule and restarted the Elasticsearch service on all nodes, and it started working.
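
For reference, a sketch of the equivalent AWS CLI call (the security group ID is a placeholder; letting the group reach itself covers node-to-node traffic):

# allow transport traffic on TCP 9300 between instances in the same security group
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 9300 \
    --source-group sg-0123456789abcdef0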

Upvotes: 0

Ryabchenko Alexander

Reputation: 12340

You can also get this error if the master starts with an index created in an old version of Elasticsearch while a worker starts with an empty index and initializes it with a newer version.

Upvotes: 1

Sandeep Kanabar

Reputation: 1302

The root cause of the master_not_discovered_exception is that the nodes are not able to reach each other on port 9300. And this needs to work both ways, i.e. node1 should be able to reach node2 on 9300 and vice versa.

Note: Elasticsearch reserves ports 9300-9400 for cluster communication and ports 9200-9300 for accessing the Elasticsearch APIs.

A simple telnet check can confirm this. From node1, run telnet node2 9300.

If it succeeds, next from node2 try telnet node1 9300.

In the case of a master_not_discovered_exception, at least one of the above telnet commands will fail.

In case you don't have telnet installed, you can do the same check with curl.
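
For example (node2 here stands in for the other node's hostname or private IP):

# from node1: check that node2's transport port is reachable
telnet node2 9300

# without telnet: curl opens the same port; a reachable Elasticsearch
# transport port typically replies that it is not an HTTP port
curl -v http://node2:9300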

Hope this helps.

Upvotes: 12

baj9032

Reputation: 2582

On my system the firewall was on, which is why I got the same error. When I turned the firewall off, everything worked fine. So make sure that your firewall is off.
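
As a quick sketch, this is how to check whether a firewall is active (firewalld on CentOS/RHEL, ufw on Ubuntu):

# CentOS / RHEL
sudo systemctl status firewalld

# Ubuntu
sudo ufw status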

Upvotes: 1

stecog

Reputation: 2344

I resolved it with this line:

network.publish_host: ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com

Every node's elasticsearch.yml config file must have this line with that node's own hostname.
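
For example (placeholder hostnames; each node publishes its own address):

# node1's elasticsearch.yml
network.publish_host: ec2-aa-aa-aa-aa.eu-west-1.compute.amazonaws.com

# node2's elasticsearch.yml
network.publish_host: ec2-bb-bb-bb-bb.eu-west-1.compute.amazonaws.com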

Upvotes: 13

pickypg

Reputation: 22332

There are a lot of settings in here that you either don't want (like the fielddata one) or don't need. Also, you're clearly using AWS EC2 instances, so you should use the cloud-aws plugin (broken into separate plugins in ES 5.x). This provides a new discovery mechanism that you can use instead of plain zen.

For each node, you'll therefore want to install the cloud-aws plugin (assuming ES 2.x):

$ bin/plugin install cloud-aws
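
You can double-check that it landed on each node with the same ES 2.x plugin script:

$ bin/plugin list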

Once installed on each node, then you can use it to take advantage of the discovery-ec2 component:

# Guarantee that the plugin is installed
plugin.mandatory: cloud-aws

# Discovery / AWS EC2 Settings
discovery:
  type: ec2
  ec2:
    availability_zones: [ "us-east-1a", "us-east-1b" ]
    groups: [ "my_security_group1", "my_security_group2" ]

cloud:
  aws:
    access_key: AKVAIQBF2RECL7FJWGJQ
    secret_key: vExyMThREXeRMm/b/LRzEB8jWwvzQeXgjqMX+6br
    region: us-east-1
  node.auto_attributes: true

# Bind to the network on whatever IP you want to allow connections on.
# You _should_ only want to allow connections from within the network
# so you only need to bind to the private IP
network.host: _ec2:privateIp_

# You can bind to all hosts that are possible to communicate with the
# node but advertise it to other nodes via the private IP (less
# relevant because of the type of discovery used, but not a bad idea).
#network:
#  bind_host: [ _ec2:privateIp_, _ec2:publicIp_, _ec2:publicDns_ ]
#  publish_host: _ec2:privateIp_

# Node-specific settings (note: nodes default to be master and data nodes)
node:
  name: node1
  master: true
  data: true

# Constant settings
cluster.name: my-cluster
bootstrap.mlockall: true

Finally, your problem is that you are failing master election for some reason that most likely stems from connectivity issues. The above configuration should fix those issues, but you have one other critical issue: you are specifying the discovery.zen.minimum_master_nodes setting incorrectly. You have two master-eligible nodes, but you are asking Elasticsearch to require only one for any election. That means that, in isolation, each master-eligible node can decide that it has a quorum and therefore elect itself separately (thus giving two masters and effectively two clusters). This is bad.

You must therefore always set that setting using the quorum formula (M / 2) + 1, rounded down, where M is the number of master-eligible nodes. So:

M = 2
(2 / 2) + 1 = (1) + 1 = 2

If you had 3, 4, or 5 master eligible nodes, then it would be:

M = 3
(3 / 2) + 1 = (1.5) + 1 = 2.5 => 2

M = 4
(4 / 2) + 1 = (2) + 1 = 3

M = 5
(5 / 2) + 1 = (2.5) + 1 = 3.5 => 3

So, you should also be setting, in your case:

discovery.zen.minimum_master_nodes: 2

Note that you could either add this as another line or modify the discovery block from above (it really comes down to YAML style):

discovery:
  type: ec2
  ec2:
    availability_zones: [ "us-east-1a", "us-east-1b" ]
    groups: [ "my_security_group1", "my_security_group2" ]
  zen.minimum_master_nodes: 2

Upvotes: 2
