MarkH

Reputation: 702

How to deploy consul using Docker 1.12 swarm mode

I have a consul cluster of 3 servers. I also have a docker swarm of around 6 workers and 3 masters (the masters are on the same hardware as the consul servers but are set with availability == drain to prevent them accepting work).
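For reference, the drain setting on those managers is just the standard node update (the node name below is a placeholder):

# mark a manager as unavailable for tasks so it only schedules/manages
docker node update --availability drain manager-1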

I generally use consul-template to read consul K/V. I cannot for the life of me work out how to sensibly roll out a consul agent service. If I use a global service then I get one agent per node but the server cluster complains because the client agents all appear to have the same IP address.

Replicated services seem to be the way to go, but I believe I need to publish the client port 8301, and that seems to cause a clash with my server cluster (which is running both the swarm masters and the consul servers, not under Docker).

I'd appreciate a general steer in the right direction - bearing in mind this is 1.12 swarm mode and therefore very different from earlier versions.

Upvotes: 12

Views: 8697

Answers (4)

thechane

Reputation: 359

For those like me who prefer to run services from docker-compose.yml files, I managed to "docker stack deploy"

https://github.com/thechane/consul/blob/master/docker-compose.yml

... to run Consul as a Docker service.

--- EDIT: poor form to just answer with links, so here it is:

version: '3.1'
#customise this with options from
#https://www.consul.io/docs/agent/options.html

services:

  seed:
    hostname: seed
    image: consul:0.8.0
    deploy:
      restart_policy:
        condition: none  #we do not want this to be restarted on timeout (see entrypoint options below)
      replicas: 1
      placement:
        constraints:
          - "engine.labels.access == temp"
          - "engine.labels.access != consul"
    environment:
      - "CONSUL_LOCAL_CONFIG={\"disable_update_check\": true}"
      - "CONSUL_BIND_INTERFACE=eth0"
    entrypoint:
      - timeout     #this seed fires up the cluster after which it is no longer needed
      - -sTERM      #this is the same signal as docker would send on a scale down / stop
      - -t300       #terminate after 5 mins
      - consul
      - agent
      - -server
      - -bootstrap-expect=5
      - -data-dir=/tmp/consuldata
      - -bind={{ GetInterfaceIP "eth0" }}
    networks:
      - "consul"

  cluster:
    image: consul:0.8.0
    depends_on:
      - "seed"
    deploy:
      mode: global                                      ##this will deploy to all nodes that
      placement:
        constraints:
          - "engine.labels.access == consul"            ##have the consul label
          - "engine.labels.access != temp"
    environment:
      - "CONSUL_LOCAL_CONFIG={\"disable_update_check\": true}"
      - "CONSUL_BIND_INTERFACE=eth0"
      - "CONSUL_HTTP_ADDR=0.0.0.0"
    entrypoint:
      - consul
      - agent
      - -server
      - -data-dir=/tmp/consuldata
      - -bind={{ GetInterfaceIP "eth0" }}
      - -client=0.0.0.0
      - -retry-join=seed:8301
      - -ui                                              ##assuming you want the UI on
    networks:
      - "consul"
    ports:
      - "8500:8500"
      - "8600:8600"
networks:
  consul:
    driver: overlay
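For reference, I deploy a stack like this from a manager node roughly as follows (the stack name "consul" is just an example, and the engine labels used in the constraints are assumed to already be set in each daemon's configuration):

# engine labels come from the Docker daemon configuration, e.g. in /etc/docker/daemon.json:
#   { "labels": ["access=consul"] }

# deploy the stack from a manager node
docker stack deploy --compose-file docker-compose.yml consul

# the services come up prefixed with the stack name
docker service ls
docker service ps consul_cluster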

Also note: I later discovered that without the seed, additional Consul instances cannot be added. So if you intend to expand your swarm node count, I'd remove the timeout command and its options from the seed entrypoint.

Upvotes: 3

Jared Mackey

Reputation: 4158

In my blog I explore a similar approach to MarkH's answer, but the key difference is that instead of pointing to the VIP of the new servers, I point to the first three nodes that join the network. This can be beneficial because the VIP has issues where it will point to itself rather than load balancing across all the nodes behind it. In my experience it was better to do it this way for the service creation.

docker service create \
  --network=consul \
  --name=consul \
  -e 'CONSUL_LOCAL_CONFIG={"skip_leave_on_interrupt": true}' \
  -e CONSUL_BIND_INTERFACE='eth0' \
  --mode global \
  -p 8500:8500 \
  consul agent -server -ui -client=0.0.0.0 \
  -bootstrap-expect 3 \
  -retry-join 172.20.0.3 \
  -retry-join 172.20.0.4 \
  -retry-join 172.20.0.5 \
  -retry-interval 5s

I am using global mode here in a 3-node swarm, so you can swap that out for replicas and add your own constraints.
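For example, a replicated variant pinned to the managers might look roughly like this (the constraint and replica count are just illustrative):

docker service create \
  --network=consul \
  --name=consul \
  -e 'CONSUL_LOCAL_CONFIG={"skip_leave_on_interrupt": true}' \
  -e CONSUL_BIND_INTERFACE='eth0' \
  --mode replicated \
  --replicas 3 \
  --constraint 'node.role == manager' \
  -p 8500:8500 \
  consul agent -server -ui -client=0.0.0.0 \
  -bootstrap-expect 3 \
  -retry-join 172.20.0.3 \
  -retry-join 172.20.0.4 \
  -retry-join 172.20.0.5 \
  -retry-interval 5s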

Upvotes: 3

MarkH

Reputation: 702

After much deliberation and many dead ends, we finally came up with a solution that works for us. Part of the problem is that, at the time of writing, Docker 1.12 is somewhat juvenile and introduces a number of concepts that have to be understood before it all makes sense. In our case, our previous experience with pre-1.12 variants of Swarm hindered our forward thinking rather than helped.

The solution we utilised to deploy a consul K/V service for our swarm goes as follows:

  1. Create an overlay network called 'consul'. This creates an address space for our service to operate within.

    docker network create --driver overlay --subnet 10.10.10.0/24 consul

  2. Deploy the consul server cluster into the new overlay. We have three hosts that we use as manager nodes, and we wanted the consul server containers to run on this cluster rather than the app servers, hence the 'constraint' flag.

    docker service create \
      -e 'CONSUL_LOCAL_CONFIG={"leave_on_terminate": true}' \
      --name consulserver \
      --network consul \
      --constraint 'node.role == manager' \
      --replicas 3 \
      consul agent -server -bootstrap-expect=3 -bind=0.0.0.0 -retry-join="10.10.10.2" -data-dir=/tmp

    The key here is that swarm will allocate a new VIP (10.10.10.2) at the start of the consul network's subnet that maps onto the three new instances.

  3. Next we deployed an agent service

    docker service create \
      -e 'CONSUL_BIND_INTERFACE=eth0' \
      -e 'CONSUL_LOCAL_CONFIG={"leave_on_terminate": true, "retry_join":["10.10.10.2"]}' \
      --publish "8500:8500" \
      --replicas 1 \
      --network consul \
      --name consulagent \
      --constraint 'node.role != manager' \
      consul agent -data-dir=/tmp -client 0.0.0.0

This specifies the VIP of the consulserver service. (Consul won't resolve names for the join; other containers may do better, allowing the service name "consulserver" to be specified rather than the VIP.)
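If you want to double check which VIP the swarm actually assigned to consulserver (it isn't guaranteed to be 10.10.10.2 in every setup), something like this shows it:

# list the VIPs assigned to the service, one per attached network
docker service inspect --format '{{ json .Endpoint.VirtualIPs }}' consulserver

# or inspect the overlay network itself
docker network inspect consul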

This done, any other service can access the consulagent by joining the consul network and resolving the name "consulagent". The consulagent service can be scaled (or maybe deployed as a global service) as required. Publishing port 8500 makes the service available at the edge of the swarm and could be dropped if you didn't need to make it available to non-swarm services.
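As an illustration of that last point, a hypothetical application service (the image name is made up) only needs to share the overlay network and can then reach the agent by name:

# attach an app to the same overlay; inside the container the agent resolves as "consulagent"
docker service create \
  --network consul \
  --name myapp \
  -e CONSUL_HTTP_ADDR=consulagent:8500 \
  mycompany/myapp:latest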

Upvotes: 6

Bernard

Reputation: 17261

It's confusing, but Docker "Swarm Mode" is really a different animal than what is still called Docker Swarm. In Swarm Mode you don't need Consul. The docker daemon on each host acts as the key/value store and does the service discovery. It does everything that Consul is needed for in the "old" Docker Swarm.

Just be careful to look for documentation/info that is specific to "swarm mode" only. I wish they had used a different name for it actually.

Upvotes: 6
