Reputation: 1674
I've set up a Docker swarm mode cluster with two managers and one worker, on CentOS 7. The machines are dkr1, dkr2 and dkr3; dkr3 is the worker.
I was upgrading to v1.13 the other day and wanted zero downtime, but it didn't work exactly as expected. I'm trying to work out the correct way to do it, since zero downtime is one of the main goals of having a cluster.
The service runs in 'global' mode, that is, one replica per machine. My method for upgrading was to drain the node, stop the daemon, yum upgrade, then start the daemon. (Note that this wiped out my daemon config settings for ExecStart=...! Be careful if you upgrade.)
Our client/ESB hits dkr2, which does its load balancing magic over the swarm. dkr2 is the leader; dkr1 is 'reachable'.
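As a rough sketch of that per-node procedure (not the exact commands I ran): the package name docker-engine, the systemd unit name docker, and the DRY_RUN helper are assumptions for illustration. The docker node commands have to run on a manager, while the systemctl and yum steps run on the node being upgraded. With DRY_RUN=1 (the default here) the script only prints the steps.

```shell
#!/bin/sh
# Sketch of upgrading one swarm node. Prints commands unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Either print the command (dry run) or execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_node() {
    node="$1"
    # On a manager: move tasks off the node.
    run docker node update --availability drain "$node"
    # On the node itself: stop the daemon, upgrade, start it again.
    # "docker-engine" is an assumed package name; check what yum has installed.
    run sudo systemctl stop docker
    run sudo yum -y upgrade docker-engine
    run sudo systemctl start docker
    # On a manager: let the node accept tasks again.
    run docker node update --availability active "$node"
}

upgrade_node dkr3
```

A drained node stays drained until its availability is set back to active, so the last step matters.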
I brought down dkr3. No issues. Upgraded docker. Brought it back up. No downtime from bringing down the worker.
Brought down dkr1. No issue at first. Still working when I brought it down. Upgraded docker. Brought it back up. But during startup, it 404'ed. Once up, it was OK.
Brought down dkr2. I didn't actually record what happened then, sorry.
Anyway, while my app was starting up on dkr1, it 404'ed, since the server hadn't started yet.
Any idea what I might be doing wrong? I would suppose I need a health check of some sort, because the container is obviously ok, but the server isn't responding yet. So that's when I get downtime.
Upvotes: 0
Views: 209
Reputation: 545
You are correct: you need to specify a healthcheck to run against your app inside the container in order to make sure it is ready. Your container will not receive traffic until this healthcheck has passed.
A simple curl to an endpoint should suffice. Use the HEALTHCHECK instruction in your Dockerfile to specify the healthcheck to perform.
An example of the healthcheck line in a Dockerfile to check if an endpoint returned 200 OK would be:
HEALTHCHECK CMD curl -f 'http://localhost:8443/somepath' || exit 1
If you can't modify your Dockerfile, then you can also specify your healthcheck manually at deployment time using the compose file healthcheck format.
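A minimal sketch of the compose route, assuming a version 3 compose file deployed with docker stack deploy. The service name app, the image myorg/myapp:latest, the file name, the stack name and the timing values are all hypothetical placeholders; the URL is the one from the Dockerfile example. The script just writes the file so you can inspect it before deploying.

```shell
#!/bin/sh
# Write an example compose file with a healthcheck (all names are placeholders).
OUT_FILE="${OUT_FILE:-docker-compose.healthcheck.yml}"

cat > "$OUT_FILE" <<'EOF'
version: "3"
services:
  app:
    image: myorg/myapp:latest
    healthcheck:
      # Same check as the Dockerfile example; curl must exist in the image.
      test: ["CMD-SHELL", "curl -f http://localhost:8443/somepath || exit 1"]
      interval: 10s
      timeout: 3s
      retries: 3
    deploy:
      mode: global
EOF

echo "wrote $OUT_FILE"
# Then, on a manager (stack name "mystack" is hypothetical):
#   docker stack deploy -c docker-compose.healthcheck.yml mystack
```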
If that's not possible either and you need to update a running service, you can do a service update and use a combination of the --health-* flags (such as --health-cmd) to specify your healthcheck.
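For example, something along these lines should work as a starting point. The service name myapp and the interval, timeout and retry values are assumptions, and curl has to be available inside the image. With DRY_RUN=1 (the default in this sketch) the command is only printed, not executed.

```shell
#!/bin/sh
# Sketch: attach a healthcheck to a running service via docker service update.
SERVICE="${SERVICE:-myapp}"                                  # hypothetical service name
HEALTH_URL="${HEALTH_URL:-http://localhost:8443/somepath}"   # endpoint from the example above
DRY_RUN="${DRY_RUN:-1}"

add_healthcheck() {
    # Build the command as positional parameters so the quoting survives.
    set -- docker service update \
        --health-cmd "curl -f '$HEALTH_URL' || exit 1" \
        --health-interval 10s \
        --health-timeout 3s \
        --health-retries 3 \
        "$SERVICE"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"     # show what would run
    else
        "$@"          # run it (on a manager node)
    fi
}

add_healthcheck
```

Run it with DRY_RUN=0 on a manager once the printed command looks right for your service.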
Upvotes: 1