Reputation: 150624
I am currently tinkering with CoreOS and building a cluster on top of it. So far, the experience with CoreOS on a single host has been quite smooth, but things get a little hazy when it comes to service discovery. Somehow I don't get the overall idea, so I am now asking here for help.
What I want to do is to have two Docker containers running where the first relies on the second. If we are talking pure Docker, I can solve this using linked containers. So far, so good.
But this approach does not work across machine boundaries, because Docker cannot link containers across multiple hosts. So I am wondering how to do this.
What I've understood so far is that CoreOS's idea of how to deal with this is to use its etcd service, which is basically a distributed key-value store that is accessible locally on each host via port 4001, so you (as a consumer of etcd) do not have to deal with any networking details: just access localhost:4001 and you're fine.
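For illustration, my understanding is that any process on a host could then talk to the cluster through its local etcd, e.g. via the HTTP API (a minimal sketch, assuming the v2 keys API on the legacy port 4001):

# Write a key through the local etcd instance:
curl -L http://localhost:4001/v2/keys/greeting -XPUT -d value='hello'
# Read it back, on this or any other host in the cluster:
curl -L http://localhost:4001/v2/keys/greeting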
So, in my head, I now have the idea that when a Docker container which provides a service spins up, it registers itself (i.e. its IP address and port) in the local etcd, and etcd takes care of distributing the information across the network. This way, you get key-value pairs such as:
RedisService => 192.168.3.132:49236
Now, when another Docker container needs to access a RedisService, it gets the IP address and port from its very own local etcd, at least once the information has been distributed across the network. So far, so good.
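In etcdctl terms, I picture it roughly like this (hypothetical key and value, matching the example above):

# On the host running the Redis container:
etcdctl set /RedisService '192.168.3.132:49236'
# On any other host, once etcd has replicated the write:
etcdctl get /RedisService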
But now I have a question that I cannot answer, and that has puzzled me for a few days now: What happens when a service goes down? Who cleans up the data inside etcd? If it is not cleaned up, all the clients will try to access a service that is no longer there.
The only (reliable) solution I can think of at the moment is making use of etcd's TTL feature for data, but this involves a trade-off: either you have quite high network traffic, as you need to send a heartbeat every few seconds, or you have to live with stale data. Neither is fine.
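For example, etcdctl can attach a TTL when setting a key (the 10 seconds here are just for illustration):

# The key vanishes unless a heartbeat re-sets it within 10 seconds:
etcdctl set /RedisService '192.168.3.132:49236' --ttl 10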
The other, well, "solution" I can think of is to make a service deregister itself when it goes down, but this only works for planned shutdowns, not for crashes, power outages, …
So, how do you solve this?
Upvotes: 2
Views: 1732
Reputation: 2456
There are a few different ways to solve this: the sidekick method, using ExecStopPost, and removing on failure. I'm assuming a trio of CoreOS, etcd and systemd, but these concepts could apply elsewhere too.
The Sidekick Method
This involves running a separate process next to your main application that heartbeats to etcd. On the simple side, this is just a loop that runs forever. You can use systemd's BindsTo to ensure that when your main unit stops, this service registration unit stops too. In ExecStop you can explicitly delete the key you're setting. We're also setting a TTL of 60 seconds to handle any ungraceful stoppage.
[Unit]
Description=Announce nginx1.service
# Binds this unit and nginx1 together. When nginx1 is stopped, this unit will be stopped too.
BindsTo=nginx1.service

[Service]
ExecStart=/bin/sh -c "while true; do etcdctl set /services/website/nginx1 '{ \"host\": \"10.10.10.2\", \"port\": 8080, \"version\": \"52c7248a14\" }' --ttl 60; sleep 45; done"
ExecStop=/usr/bin/etcdctl rm /services/website/nginx1

[Install]
WantedBy=local.target
On the complex side, this could be a container that starts up and hits a /health endpoint that your app provides, running a health check before sending data to etcd.
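A rough sketch of such a health-checking announce loop (the /health endpoint, address and payload are placeholders for your app's actual details):

#!/bin/sh
# Announce the service only while its /health endpoint responds successfully;
# if the checks keep failing, the TTL expires the key on its own.
while true; do
  if curl -fs http://10.10.10.2:8080/health > /dev/null; then
    etcdctl set /services/website/nginx1 '{ "host": "10.10.10.2", "port": 8080, "version": "52c7248a14" }' --ttl 60
  fi
  sleep 45
done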
ExecStopPost
If you don't want to run something beside your main app, you can have etcdctl commands within your main unit that run on start and stop. Be aware that this won't catch all failures, as you mentioned.
[Unit]
Description=MyWebApp
After=docker.service
Requires=docker.service
After=etcd.service
Requires=etcd.service

[Service]
ExecStart=/usr/bin/docker run --rm --name myapp1 -p 8084:80 username/myapp command
ExecStartPost=/usr/bin/etcdctl set /services/myapp/%H:8084 '{ \"host\": \"%H\", \"port\": 8084, \"version\": \"52c7248a14\" }'
ExecStopPost=/usr/bin/etcdctl rm /services/myapp/%H:8084

[Install]
WantedBy=local.target
%H is a systemd specifier that substitutes in the machine's hostname; on a machine named core-01, for example, the key becomes /services/myapp/core-01:8084. If you're interested in more specifier usage, check out the CoreOS Getting Started with systemd guide.
Removing on Failure
On the client side, you could remove any instance that you have failed to connect to more than X times. If you get a 500 or a timeout from /services/myapp/instance1, you could keep increasing its failure count and try connecting to the other hosts in the /services/myapp/ directory.
etcdctl set /services/myapp/instance1 '{ \"host\": \"%H\", \"port\": 8084, \"version\": \"52c7248a14\", \"failures\": 1 }'
When you hit your desired threshold, remove the key with etcdctl.
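A minimal sketch of that bookkeeping in shell, assuming a plain counter kept in a separate, hypothetical instance1-failures key rather than inside the JSON blob:

#!/bin/sh
# After a failed request to instance1, bump its failure count; prune the
# instance once it has failed three times (threshold chosen for illustration).
FAILURES=$(etcdctl get /services/myapp/instance1-failures 2>/dev/null || echo 0)
FAILURES=$((FAILURES + 1))
if [ "$FAILURES" -ge 3 ]; then
  etcdctl rm /services/myapp/instance1
  etcdctl rm /services/myapp/instance1-failures
else
  etcdctl set /services/myapp/instance1-failures "$FAILURES"
fi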
Regarding the network traffic that heartbeating would cause: in most cases you should be sending this traffic over a local private network that your provider runs, so it should be free and very fast. etcd is constantly heartbeating with its peers anyway, so this is just a small increase in traffic.
Hop into #coreos on Freenode if you have any other questions!
Upvotes: 8