How to enable faster container rescheduling with Docker Swarm and Consul?

Question

For some background on my environment:

I have Docker Swarm running on three Ubuntu 14.04 Vagrant boxes. The Swarm master runs on one machine (together with Consul), and the other two machines run Swarm workers that are joined to the master. I set up the environment following the documentation at https://docs.docker.com/swarm/install-manual/. It works correctly: any docker -H :4000 run command issued from my master machine succeeds. Service discovery is active, as I am running the gliderlabs/registrator container on both of my workers.

The issue:

Any change to my cluster, such as a node or container failure, and the resulting rescheduling of containers by Swarm (containers created with the flag -e "reschedule:on-node-failure") takes about 30 to 45 seconds to show up. By comparison, when I was running fleet and etcd on CoreOS, container rescheduling and notification of node failures usually occurred within about 5 seconds. Is there any way to change settings in Consul and Docker Swarm to speed this up to a level similar to what I experienced with fleet and etcd on CoreOS? If so, what would I need to do?
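For concreteness, here is a minimal sketch of the kind of command involved. It only prints the docker invocation rather than running it; the function name reschedule_run_cmd and the nginx image are placeholders for illustration, not part of my actual setup.

```shell
#!/bin/sh
# Sketch only: the helper name and the image passed to it are hypothetical.
SWARM_HOST=":4000"   # Swarm manager endpoint used throughout this question

# Print the run command for a container that Swarm should reschedule
# when its node fails. Pipe the output to sh to actually execute it.
reschedule_run_cmd() {
    image="$1"
    echo "docker -H $SWARM_HOST run -d -e reschedule:on-node-failure $image"
}

reschedule_run_cmd nginx
```

Printing the command first makes it easy to check the flags before starting a container and then timing how long docker -H :4000 ps takes to reflect a node failure.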

tl;dr: I am running Swarm with Consul. Container rescheduling and changes to the output of docker -H :4000 ps don't occur until about 30 to 45 seconds after a node goes down. How can I reduce this delay?

