Reputation: 2687
I have installed and configured Mesos and Marathon. Whenever I try to schedule an application, it remains in the 'Waiting' state, which seems to indicate that Marathon is waiting for offers from Mesos.
When I check the logs in Mesos, I see the following:
I0425 20:22:10.313910 4279 master.cpp:2231] Received SUBSCRIBE call for framework 'chronos-2.4.0' at [email protected]:50892
I0425 20:22:10.313987 4279 master.cpp:2302] Subscribing framework chronos-2.4.0 with checkpointing enabled and capabilities [ ]
I0425 20:22:10.313994 4279 master.cpp:2312] Framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0001 (chronos-2.4.0) at [email protected]:50892 already subscribed, resending acknowledgement
W0425 20:22:10.314007 4279 master.hpp:1764] Master attempted to send message to disconnected framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0001 (chronos-2.4.0) at [email protected]:50892
E0425 20:22:10.314193 4287 process.cpp:1958] Failed to shutdown socket with fd 39: Transport endpoint is not connected
I0425 20:22:11.226884 4284 master.cpp:2231] Received SUBSCRIBE call for framework 'marathon' at [email protected]:35928
I0425 20:22:11.226959 4284 master.cpp:2302] Subscribing framework marathon with checkpointing enabled and capabilities [ ]
I0425 20:22:11.226969 4284 master.cpp:2312] Framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0000 (marathon) at [email protected]:35928 already subscribed, resending acknowledgement
W0425 20:22:11.226982 4284 master.hpp:1764] Master attempted to send message to disconnected framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0000 (marathon) at [email protected]:35928
E0425 20:22:11.227226 4287 process.cpp:1958] Failed to shutdown socket with fd 39: Transport endpoint is not connected
I0425 20:22:12.113598 4281 http.cpp:312] HTTP GET for /master/state from 192.0.2.1:49698 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36'
I0425 20:22:12.314221 4286 master.cpp:2231] Received SUBSCRIBE call for framework 'chronos-2.4.0' at [email protected]:50892
I0425 20:22:12.314304 4286 master.cpp:2302] Subscribing framework chronos-2.4.0 with checkpointing enabled and capabilities [ ]
I0425 20:22:12.314312 4286 master.cpp:2312] Framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0001 (chronos-2.4.0) at [email protected]:50892 already subscribed, resending acknowledgement
W0425 20:22:12.314337 4286 master.hpp:1764] Master attempted to send message to disconnected framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0001 (chronos-2.4.0) at [email protected]:50892
E0425 20:22:12.314524 4287 process.cpp:1958] Failed to shutdown socket with fd 39: Transport endpoint is not connected
I0425 20:22:13.081887 4284 master.cpp:2231] Received SUBSCRIBE call for framework 'marathon' at [email protected]:35928
I0425 20:22:13.081964 4284 master.cpp:2302] Subscribing framework marathon with checkpointing enabled and capabilities [ ]
I0425 20:22:13.081987 4284 master.cpp:2312] Framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0000 (marathon) at [email protected]:35928 already subscribed, resending acknowledgement
W0425 20:22:13.082005 4284 master.hpp:1764] Master attempted to send message to disconnected framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0000 (marathon) at [email protected]:35928
E0425 20:22:13.082314 4287 process.cpp:1958] Failed to shutdown socket with fd 39: Transport endpoint is not connected
I0425 20:22:13.221590 4282 master.cpp:2231] Received SUBSCRIBE call for framework 'marathon' at [email protected]:35928
I0425 20:22:13.221664 4282 master.cpp:2302] Subscribing framework marathon with checkpointing enabled and capabilities [ ]
I0425 20:22:13.221674 4282 master.cpp:2312] Framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0000 (marathon) at [email protected]:35928 already subscribed, resending acknowledgement
W0425 20:22:13.221688 4282 master.hpp:1764] Master attempted to send message to disconnected framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0000 (marathon) at [email protected]:35928
E0425 20:22:13.222162 4287 process.cpp:1958] Failed to shutdown socket with fd 39: Transport endpoint is not connected
I0425 20:22:14.412215 4286 master.cpp:2231] Received SUBSCRIBE call for framework 'marathon' at [email protected]:35928
I0425 20:22:14.412281 4286 master.cpp:2302] Subscribing framework marathon with checkpointing enabled and capabilities [ ]
I0425 20:22:14.412289 4286 master.cpp:2312] Framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0000 (marathon) at [email protected]:35928 already subscribed, resending acknowledgement
W0425 20:22:14.412302 4286 master.hpp:1764] Master attempted to send message to disconnected framework c16a5bfb-838e-4d43-bf3c-21bf94358ab5-0000 (marathon) at [email protected]:35928
E0425 20:22:14.412495 4287 process.cpp:1958] Failed to shutdown socket with fd 39: Transport endpoint is not connected
Any idea why it mentions a 'disconnected' framework? In Mesos, I can see the 3 slaves, and both the Marathon and Chronos frameworks are listed under 'active frameworks'.
The /etc/hosts file contains the following entries:
192.0.2.11 master1 # VAGRANT: cd38e81ab8742b23dfbcb913468368ea (master1) / 1b611425-dbad-4bd0-8727-4169c09ec045
192.0.2.51 slave1 # VAGRANT: 94630539b67d178dddffda29a0313a75 (slave1) / 1a1694de-2bd2-4d96-bdf2-dd6767d1f310
192.0.2.52 slave2 # VAGRANT: 306e67b33b327b3d1c9990bf1316a321 (slave2) / bdbd677e-5298-4d49-90a8-e521139dd127
192.0.2.12 master2 # VAGRANT: fb338e9e9c001a5bfab605387ba88d02 (master2) / bdccfd80-b1e6-48a0-8986-b24c7cbd7a25
192.0.2.53 slave3 # VAGRANT: 3913b3358eadc90c622859ddb90bfede (slave3) / 786cbe69-2af5-43b7-8e70-d6cc07d4ddf4
192.0.2.13 master3 # VAGRANT: 92cdd6e36a6c0391e2a66f73661e56fe (master3) / 03bb2c16-f474-4412-b8f4-fce82e12955c
Note: in case more info is needed on how the cluster was installed, please refer to this
Upvotes: 2
Views: 1077
Reputation: 883
You can also set LIBPROCESS_IP as an environment variable. I think this is better than changing /etc/hosts.
Found the solution here: https://groups.google.com/forum/#!topic/marathon-framework/1qboeZTOLU4
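As a rough illustration of the LIBPROCESS_IP approach, here is a minimal Python sketch that builds an environment with the variable set and refuses loopback addresses, since the master could not connect back to those. The helper name libprocess_env and the address 192.0.2.11 (master1 from the question's /etc/hosts) are assumptions made for the example, not part of Marathon or Mesos.

```python
import ipaddress
import os


def libprocess_env(ip, base_env=None):
    """Return a copy of the environment with LIBPROCESS_IP set to `ip`.

    Hypothetical helper: rejects malformed and loopback addresses, because
    the Mesos master must be able to connect back to the framework.
    """
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr.is_loopback:
        raise ValueError(f"{ip} is a loopback address")
    env = dict(os.environ if base_env is None else base_env)
    env["LIBPROCESS_IP"] = str(addr)
    return env


# 192.0.2.11 is master1's address in the question; use each host's own address.
print(libprocess_env("192.0.2.11", base_env={}))
```

The returned dictionary can be passed as the env argument of subprocess.run or subprocess.Popen for whatever command starts Marathon or Chronos on that host.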
Upvotes: 2
Reputation: 31479
I guess you need to make sure that the hostnames resolve to actual IP addresses.
That's at least what fixed my problems when Marathon etc. tried to bind to 127.0.1.1 on Ubuntu. That is, on each host you should add the IP-to-hostname mappings to the /etc/hosts file, e.g. an entry such as
192.0.2.11 master1
Either put that entry before the line that maps 127.0.1.1 to the hostname, or remove the 127.0.1.1 entry entirely. The Vagrant plugin vagrant-hostsupdater might help.
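To check whether a host is affected, a small Python sketch like the one below can scan the contents of a hosts file for loopback addresses mapped to a given hostname. The function name loopback_mappings and the sample text are made up for illustration; the sample only mimics the Ubuntu-style 127.0.1.1 line described above.

```python
import ipaddress


def loopback_mappings(hosts_text, hostname):
    """Return the loopback addresses that hosts_text maps to hostname."""
    hits = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        try:
            addr = ipaddress.ip_address(address)
        except ValueError:
            continue  # skip lines that do not start with an IP address
        if hostname in names and addr.is_loopback:
            hits.append(address)
    return hits


# Illustrative sample, not taken from the question's machines.
sample = """\
127.0.0.1 localhost
127.0.1.1 master1
192.0.2.11 master1
"""
print(loopback_mappings(sample, "master1"))  # → ['127.0.1.1']
```

On a real host, the same function could be called with the text of /etc/hosts and the value of socket.gethostname(); a non-empty result suggests the hostname may resolve to a loopback address first.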
Upvotes: 1