David Elner


Docker swarm will not deploy stacks, stuck at 0/1 replicas

Problem

I'm testing Docker Swarm and it won't deploy any stacks: after stack deploy, the replicas remain at 0/1.


It's a two-node cluster of 1 manager and 1 worker. The manager is set to drain, so that the stack will deploy to the worker.

Here's what I'm doing:

docker-compose.yml:

---
services:
  whoami:
    image: traefik/whoami

Then deploying the stack with:

test-portal:/app/whoami$ docker stack deploy -c docker-compose.yml whoami
Since --detach=false was not specified, tasks will be created in the background.
In a future release, --detach=false will become the default.
Creating network whoami_default
Creating service whoami_whoami

There are no containers deployed on either machine. The following can be observed:

test-portal:/app/whoami$ docker service ls
ID             NAME            MODE         REPLICAS   IMAGE                   PORTS
tsxrvw12zg6i   whoami_whoami   replicated   0/1        traefik/whoami:latest   

test-portal:/app/whoami$ docker network ls
NETWORK ID     NAME              DRIVER    SCOPE
149473faf294   bridge            bridge    local
76df1d9a4c91   docker_gwbridge   bridge    local
0939cda44322   host              host      local
n00f2g1whcn0   ingress           overlay   swarm
ee11daff62a5   none              null      local
uno0f18wnbbp   whoami_default              swarm

test-portal:/app/whoami$ docker network inspect uno
[
    {
        "Name": "whoami_default",
        "Id": "uno0f18wnbbp14jv0000wnnzs",
        "Created": "2025-03-02T14:38:34.136561408Z",
        "Scope": "swarm",
        "Driver": "",
        "EnableIPv4": false,
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "",
            "Options": null,
            "Config": null
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": null,
        "Options": null,
        "Labels": {
            "com.docker.stack.namespace": "whoami"
        }
    }
]
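
One thing that stands out in the output above: whoami_default has an empty Driver and a null IPAM Config, whereas the ingress network (inspected further down) has the overlay driver and a subnet. Below is a minimal sketch for flagging networks in that state from docker network inspect JSON. The helper name looks_unallocated is hypothetical, the JSON is trimmed from the output above, and reading this state as "Swarm never finished allocating the network" is my assumption, not something Docker reports.

```python
import json

# Trimmed copy of the `docker network inspect uno` output shown above.
inspect_output = """
[
    {
        "Name": "whoami_default",
        "Scope": "swarm",
        "Driver": "",
        "IPAM": {"Driver": "", "Options": null, "Config": null}
    }
]
"""


def looks_unallocated(network: dict) -> bool:
    """Return True when a network has no driver and no IPAM subnet config.

    Hypothetical helper: the interpretation of this state is an assumption.
    """
    ipam = network.get("IPAM") or {}
    return network.get("Driver", "") == "" and not ipam.get("Config")


for net in json.loads(inspect_output):
    print(net["Name"], "looks unallocated:", looks_unallocated(net))
```

The same check returns False for the ingress network, because it has both a driver and a subnet.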

Potential clues:

Update #1

More experimentation shows:

This seems to suggest network configuration/subnets on Docker Swarm are part of the problem.

Update #2

I'm having some success tearing down and recreating the swarm with docker swarm init --default-addr-pool 10.100.0.0/16 --default-addr-pool-mask-length 24, as suggested in this post. At least the service now comes up and is addressable.

I don't understand why using a /16 pool and changing the default mask length works. Was the original /24 pool too restrictive, or was it interfering with something?
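
A possible explanation, as a sketch: if I assume Swarm carves one subnet of the configured SubnetSize out of the default address pool for each overlay network, then the original pool (10.1.99.0/24 with SubnetSize 24) holds exactly one subnet, and the ingress network already occupies 10.1.99.0/24. That would leave nothing for whoami_default. The function name overlay_subnets is hypothetical, and the allocation behaviour is my assumption rather than something confirmed from the Docker source.

```python
import ipaddress


def overlay_subnets(pool: str, subnet_prefix: int) -> list:
    """List every subnet of length subnet_prefix that fits inside pool."""
    return list(ipaddress.ip_network(pool).subnets(new_prefix=subnet_prefix))


# Original swarm: Default Address Pool 10.1.99.0/24, SubnetSize 24.
original = overlay_subnets("10.1.99.0/24", 24)

# Recreated swarm: --default-addr-pool 10.100.0.0/16, mask length 24.
recreated = overlay_subnets("10.100.0.0/16", 24)

print("original pool subnets: ", len(original))
print("recreated pool subnets:", len(recreated))
```

Under that assumption the original pool yields a single /24 and the recreated pool yields 256 of them, which would be consistent with whoami_default now getting an address such as 10.100.1.3.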

Still, the containers created in these new networks (e.g. 10.100.1.3 for whoami_default) seem unreachable from the manager or worker node itself. Do I still have a network misconfiguration? Shouldn't Docker be routing to this subnet?


Other environmental info:

test-portal:/app/whoami$ docker node ls
ID                            HOSTNAME                            STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
phg2egnc6m4o7vpf47osurp4s *   test-portal.private.network         Ready     Drain          Leader           28.0.1
wer4delycdryonov1l1u3tp4l     test-swarm-node-1.private.network   Ready     Active                          28.0.

test-portal:/app/whoami$ docker info
Client: Docker Engine - Community
 Version:    28.0.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.21.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.33.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 2
 Server Version: 28.0.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: active
  NodeID: phg2egnc6m4o7vpf47osurp4s
  Is Manager: true
  ClusterID: yq0wosj1mkj28sbkv6vfoj42j
  Managers: 1
  Nodes: 2
  Default Address Pool: 10.1.99.0/24  
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 10.1.5.101
  Manager Addresses:
   10.1.5.101:2377
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: bcc810d6b9066471b0b6fa75f557a15a1cbf31bb
 runc version: v1.2.4-0-g6c52b3f
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.1.0-18-amd64
 Operating System: Debian GNU/Linux 12 (bookworm)
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 3.816GiB
 Name: test-portal.private.network
 ID: 9538b0fe-cc7d-4882-97cc-52c5f1d0bb92
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false

test-portal:/app/whoami$ docker network inspect n0
[
    {
        "Name": "ingress",
        "Id": "n00f2g1whcn0wzsts4hvgvyzo",
        "Created": "2025-03-01T14:56:43.92160206-05:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv4": true,
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.1.99.0/24",
                    "Gateway": "10.1.99.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": true,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "ingress-sbox": {
                "Name": "ingress-endpoint",
                "EndpointID": "309e69924e66c572bbc690672cb8cd9ef466b8b4f3d7e8e9228d0ee7513f71ae",
                "MacAddress": "02:42:0a:01:63:02",
                "IPv4Address": "10.1.99.2/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4096"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "24b3935b696e",
                "IP": "10.1.5.101"
            },
            {
                "Name": "9ab40688269f",
                "IP": "10.1.5.102"
            }
        ]
    }
]
