psaha4

Reputation: 339

Mesos slaves are not connecting to Mesos masters

I have a setup with two Mesos masters and two Mesos slaves. After making all the required configuration changes, I can see that both masters are part of a cluster maintained by ZooKeeper, and they switch at a regular time interval. So I believe my Mesos master configuration was successful.

Now I have set up two Mesos slaves. When I start the mesos-slave service, I expect the slaves to show up in the Mesos master web UI, but I cannot see either of them in the Slaves tab.

I have followed a document mentioned here

I am not sure what could have gone wrong. I have verified the IP addresses of the masters and slaves, and they are configured correctly.

I don't know which part needs to be checked for troubleshooting.

Upvotes: 2

Views: 3535

Answers (3)

Eren Güven

Reputation: 2374

I have experienced scenarios that can cause this:

  • The slave terminated soon after start. This sometimes happens after package upgrades or when the VM's resources change (e.g. you scaled up the CPUs on your cloud instance). You'll see a line in the mesos-slave logs about this, telling you to remove a directory. Usually the solution is to remove $WORK_DIR/meta. If you don't need to recover any executors, you can just remove the whole WORK_DIR. Then start mesos-slave again.
  • The slave cannot connect to ZooKeeper to determine the master. This can happen if you provide a ZK entry for master discovery (which you should, in /etc/mesos/zk) instead of providing the master options directly. Ensure mesos-slave -> ZooKeeper connectivity.
  • Similar to the above: the /etc/mesos/zk entry (or at least the ZooKeeper node) is not identical across your cluster.
  • Ensure mesos-master(s) <-> mesos-slave(s) connectivity.
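To illustrate the third point, here is a rough Python sketch for comparing the /etc/mesos/zk contents copied from two nodes. It assumes the usual zk://host1:port1,host2:port2/path form and treats the host order as irrelevant; the function names are made up for this example, and credentials embedded in the URL are not handled.

```python
def parse_zk_url(url):
    """Split a zk:// URL into (set of host:port entries, znode path)."""
    url = url.strip()  # files such as /etc/mesos/zk often end with a newline
    prefix = "zk://"
    if not url.startswith(prefix):
        raise ValueError("not a zk:// URL: %r" % url)
    rest = url[len(prefix):]
    hosts_part, _, path = rest.partition("/")
    hosts = frozenset(h.strip() for h in hosts_part.split(",") if h.strip())
    return hosts, "/" + path.rstrip("/")


def same_zk_target(a, b):
    """True if both URLs name the same ZooKeeper hosts and the same znode."""
    return parse_zk_url(a) == parse_zk_url(b)
```

If same_zk_target returns False for the strings taken from a master and a slave, the two nodes are not looking at the same ZooKeeper ensemble or znode.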

Upvotes: 0

Dharmit

Reputation: 5908

If you're setting up a Mesos cluster on AWS or a similar service, make sure that the required ports are open. From the setup I did on AWS, I remember the ports below, but you might want to verify them:

  • ZooKeeper - 2181
  • Mesos Master - 5050
  • Mesos Slave - 5051

You can use telnet from master to slave and vice versa on the ports mentioned above to make sure that a firewall is not the issue. Also, make sure that the quorum value is set properly.
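If telnet is not installed, the same check can be sketched in Python with a plain TCP connect. The port numbers are the ones listed above and may differ in your setup; the function names and the host name in the usage comment are placeholders, not anything Mesos provides.

```python
import socket

# Ports as remembered above; verify them against your own configuration.
PORTS = {"zookeeper": 2181, "mesos-master": 5050, "mesos-slave": 5051}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name lookup failed
        return False


def check_host(host, ports=PORTS):
    """Map each service name to whether its port is reachable on host."""
    return {name: port_open(host, port) for name, port in ports.items()}


# Example (hypothetical host name), run from a slave towards a master:
#   print(check_host("master1.example.internal"))
```

A False only tells you that the TCP connection failed from where you ran it. It does not say whether a firewall, a security group, or a stopped service is the cause, so run it in both directions.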

"i have a setup where i am using two mesos masters and two mesos slaves."

It is recommended to have one or an odd number of master nodes. You might want to add or remove one master node. I can't find the link for this recommendation at the moment, but I will add it once I find it.
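The reasoning behind the odd-number recommendation can be sketched with a little arithmetic, assuming the quorum has to be a strict majority of the masters (which is how I understand the Mesos quorum setting). The function names are just for illustration.

```python
def quorum_size(masters):
    """Smallest number of masters that is a strict majority."""
    return masters // 2 + 1


def tolerated_failures(masters):
    """How many masters can be lost while a majority is still reachable."""
    return masters - quorum_size(masters)


for n in (1, 2, 3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 1 1 0
# 2 2 0
# 3 2 1
# 5 3 2
```

With two masters the quorum is 2, so losing either one leaves the cluster without a majority. Two masters therefore tolerate no more failures than a single master does, while three tolerate one.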

"zookeepers and they are switching in a time interval"

If they're switching at regular intervals, chances are the configuration is not correct. The leading master role switches only when the current leading master is malfunctioning for some reason. Otherwise, it doesn't keep switching by itself.

Besides that, providing logs from the master and slave nodes would help. Logs on my CentOS 7 system are at /var/log/mesos/mesos-master.INFO and /var/log/mesos/mesos-slave.INFO for master and slave nodes respectively. An excerpt from these files in your question would be helpful.
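To pull a useful excerpt out of those files, a small Python sketch like the one below can filter for warnings and errors. It assumes the logs use the glog line prefix (a severity letter I, W, E or F followed by the date and time); the function names are invented for this example, and the path you pass in depends on your system.

```python
import re

# glog-style prefix: severity letter, then mmdd and hh:mm:ss.uuuuuu
GLOG_PREFIX = re.compile(r"^([IWEF])\d{4} \d{2}:\d{2}:\d{2}\.\d{6}")


def problem_lines(lines):
    """Yield only the lines logged at WARNING, ERROR or FATAL severity."""
    for line in lines:
        match = GLOG_PREFIX.match(line)
        if match and match.group(1) in "WEF":
            yield line.rstrip("\n")


def scan_log(path):
    """Read a log file and return its warning/error/fatal lines as a list."""
    with open(path, errors="replace") as fh:
        return list(problem_lines(fh))


# Example, using the slave log location mentioned above:
#   for line in scan_log("/var/log/mesos/mesos-slave.INFO"):
#       print(line)
```

Lines that do not match the prefix (such as the file header or continuation lines) are skipped, so it is still worth reading the context around anything this prints.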

Upvotes: 2

hartem

Reputation: 411

Could you please verify that there is connectivity between slaves and the masters? It might be a good idea to take a look at master and slave logs to see what's going on.

Upvotes: 2
