Reputation: 349
I'm trying to run a test cluster locally, following this guide: https://mesosphere.com/2014/07/07/installing-mesos-on-your-mac-with-homebrew/
Currently, I have a master running at localhost:5050 and a slave running on the default port 5051 (with slave ID, say, S0). However, when I tried to start another slave on a different port, it re-registered itself as S0 and the master console still showed only 1 activated slave. Does anybody know how I can start a second slave, S1? Thanks!
Upvotes: 0
Views: 810
Reputation: 3726
Did you specify a different work_dir? E.g.
sudo /usr/local/sbin/mesos-slave --master=localhost:5050 --port=5052 --work_dir=/tmp/mesos2
To explain a bit why this is needed and where the behavior you saw came from: Mesos supports so-called slave recovery to help with upgrades and error recovery.
Therefore, when a slave starts, it checks its work_dir for a checkpoint and tries to recover that state (i.e. reconnect to still-running executors). In your case, both slaves started from the same working directory, so the second one tried to recover the checkpoint of the still-running first slave, which is why it re-registered as S0.
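As a rough sketch of the idea, the snippet below only prints one command line per slave, giving each its own port and its own work_dir. The helper name slave_cmd and the /tmp/mesos-slave-N paths are made up for illustration; the binary path and the --master, --port and --work_dir flags are the ones used in the command above. Adjust everything to your own setup before running the printed commands.

```shell
#!/bin/sh
# Sketch: build mesos-slave command lines that never share a port or work_dir.
# Nothing is started here; the commands are only printed for inspection.

MASTER="localhost:5050"   # master address from the question

# slave_cmd N: print the command line for the N-th slave (N = 0, 1, 2, ...).
# The port offset and the /tmp/mesos-slave-N layout are illustrative assumptions.
slave_cmd() {
    n="$1"
    port=$((5051 + n))              # 5051 is the default slave port
    work_dir="/tmp/mesos-slave-$n"  # hypothetical per-slave checkpoint directory
    printf '%s\n' "/usr/local/sbin/mesos-slave --master=$MASTER --port=$port --work_dir=$work_dir"
}

slave_cmd 0
slave_cmd 1
```

Because every slave gets a separate work_dir, none of them finds another slave's checkpoint on startup, so each should register with the master under its own ID.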
P.S. I should probably replace all the above occurrences of slave with worker (https://issues.apache.org/jira/browse/MESOS-1478), but I hope this is easier to read.
Upvotes: 6