Reputation: 600
We have a networking problem in Docker swarm. The problem is below:
Where should I look? Any advice?
server-1:~$ docker version
Client:
 Version:      17.03.0-ce
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   3a232c8
 Built:        Tue Feb 28 08:01:32 2017
 OS/Arch:      linux/amd64
Server:
 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   3a232c8
 Built:        Tue Feb 28 08:01:32 2017
 OS/Arch:      linux/amd64
 Experimental: true
PS: I checked this post, but I have the latest version of Docker / Docker swarm, so that issue should already be fixed.
PS 2: a similar problem: https://github.com/docker/swarm/issues/2687
Upvotes: 11
Views: 9418
Reputation: 3318
Tangentially related, but this post shows up in searches for similar problems. If you arrive here while trying to set up Docker swarm on AWS EC2 and have issues similar to the OP's, you may need a specific Security Group rule for IP protocol 50 (IPsec ESP). This applies if you are using encrypted overlay networks.
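To make the Security Group rule concrete, here is a minimal sketch that only prints an AWS CLI command rather than running it. It assumes all swarm nodes share one security group, and the group ID you pass in is a placeholder of your own; `esp_rule_cmd` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Sketch: build (but do not execute) an AWS CLI call that allows
# IP protocol 50 (ESP) between members of the same security group.
# Assumption: every swarm node is attached to the group passed as $1.
esp_rule_cmd() {
    sg_id="$1"
    printf 'aws ec2 authorize-security-group-ingress --group-id %s --protocol 50 --source-group %s\n' \
        "$sg_id" "$sg_id"
}
```

Calling `esp_rule_cmd <your-group-id>` prints the command so you can review it before running it yourself. Note that this is protocol 50, not TCP or UDP port 50.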
Upvotes: 1
Reputation: 31
Resolution to the issue mentioned above.
Use the following when initializing the swarm:
docker swarm init --advertise-addr=YOURIP --listen-addr=0.0.0.0 --data-path-port=7779 --force-new-cluster=true
Resources:
Docker:
VMWare:
Thanks @Izkuru
Upvotes: 2
Reputation: 354
"VTEP Port is reserved or restricted for VMware use, any virtual machine cannot use this port for other purpose or for any other application."
But we can change the Docker swarm data path port (4789 by default) to another port:
docker swarm init --data-path-port=7789
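A small sketch of how you might wrap this, assuming you also want to set the advertise address. `swarm_init_cmd` is a hypothetical helper that only prints the command; the range check reflects the 1024 to 49151 range that the Docker documentation gives for `--data-path-port`. As far as I know the flag needs Docker 19.03 or newer and can only be set at `swarm init` time, so an existing swarm has to be re-created.

```shell
#!/bin/sh
# Sketch: print a "docker swarm init" command with a non-default
# VXLAN data path port. Nothing is executed here.
# $1 = address other nodes can reach, $2 = data path port.
swarm_init_cmd() {
    addr="$1"
    port="$2"
    # Reject empty or non-numeric ports.
    case "$port" in
        ''|*[!0-9]*) echo "port must be numeric" >&2; return 1 ;;
    esac
    # Docker documents 1024-49151 as the accepted range for this flag.
    if [ "$port" -lt 1024 ] || [ "$port" -gt 49151 ]; then
        echo "port must be in 1024-49151" >&2
        return 1
    fi
    printf 'docker swarm init --advertise-addr=%s --data-path-port=%s\n' "$addr" "$port"
}
```

After initializing, `docker info | grep -i 'data path port'` on a manager should show the port in use. Remember to open the new UDP port between hosts instead of 4789.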
Upvotes: 19
Reputation: 71
Out of curiosity, in your VMware environment, do you have NSX deployed? I may have an answer, but it only applies if NSX is deployed in the environment.
ESXi will apparently drop OUTBOUND packets from VMs if the destination port is the same as the port configured for the VXLAN VTEP communication.
NSX utilizes port 4789/udp for VTEP communication for VXLAN (by default, as of 6.2.3; prior to that, it was 8472/udp). (If the VMs are on the same host, then traffic is not dropped, because, while it may be OUTBOUND traffic, it does not egress the host, and does not get to the same stage within the VMKernel to be dropped.)
The wording in KB2079386 is a little off. It states:
VXLAN port 8472 is reserved or restricted for VMware use, any virtual machine cannot use this port for other purpose or for any other application.
But, it should read:
VTEP Port is reserved or restricted for VMware use, any virtual machine cannot use this port for other purpose or for any other application.
If you are using NSX, you could try changing the port used for the VXLAN VTEPs, but port 4789/udp is required if you are going to leverage hardware VTEPs at all.
(I can't take full credit for this. I stumbled across this blog post talking about similar behavior when troubleshooting a similar issue.)
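One way to check whether this is what is happening is to capture VXLAN traffic on the receiving host while a container on another host pings across the overlay. The sketch below only prints a `tcpdump` command; the interface name is an assumption you need to adjust, and `vxlan_capture_cmd` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print a tcpdump command that watches for overlay (VXLAN) packets.
# $1 = host network interface (for example eth0 or ens192, an assumption),
# $2 = UDP port used for the data path, defaulting to 4789.
vxlan_capture_cmd() {
    iface="$1"
    port="${2:-4789}"
    printf 'tcpdump -n -i %s udp port %s\n' "$iface" "$port"
}
```

If the sending host shows outgoing packets on that port but nothing arrives on the receiver, the traffic is being dropped somewhere in between, which would be consistent with the behavior described above.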
Upvotes: 7
Reputation: 1029
If your nodes are not on the same subnet (e.g. they all have public IPs), make sure you pass the --advertise-addr option with an IP address that the other nodes can reach whenever a node joins the swarm. This applies to managers AND workers.
Otherwise the overlay network will not route correctly between hosts, even though stack deployment, node registration, etc. appear to work fine.
See the detailed explanation for my case in the same GitHub issue --> https://github.com/docker/swarm/issues/2687
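For illustration, a join with an explicit advertise address might be built like this. It is a sketch that only prints the command; the token and addresses are placeholders you supply, 2377 is the default swarm management port, and `swarm_join_cmd` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print a "docker swarm join" command that advertises an
# address the other nodes can actually reach. Nothing is executed.
# $1 = this node's reachable IP, $2 = join token, $3 = manager address.
swarm_join_cmd() {
    node_ip="$1"
    token="$2"
    manager="$3"
    printf 'docker swarm join --advertise-addr %s --token %s %s:2377\n' \
        "$node_ip" "$token" "$manager"
}
```

Once a node has joined, `docker node inspect <node> --format '{{ .Status.Addr }}'` on a manager shows the address the swarm recorded for it, which is a quick way to spot a node that advertised an unreachable private IP.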
Upvotes: 1
Reputation: 264831
The first thing I would check for overlay networking is your firewall rules. You need the following open between the hosts:
iptables -A INPUT -p 50 -j ACCEPT
If that doesn't help, look into using netshoot to debug where the traffic is getting stopped.
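As a rough sketch of the firewall side, the function below prints iptables rules for the ports the Docker documentation lists for swarm mode: 2377/tcp (cluster management), 7946/tcp and 7946/udp (node communication), 4789/udp (overlay traffic), plus IP protocol 50 for encrypted overlays. The source CIDR is an assumption about where your other hosts live, and `swarm_firewall_rules` is a hypothetical helper name. If you changed the data path port, substitute it for 4789.

```shell
#!/bin/sh
# Sketch: print (not apply) iptables rules that open swarm traffic
# from the other hosts. $1 = CIDR covering the other swarm nodes.
swarm_firewall_rules() {
    src="$1"
    # One "protocol port" pair per line, taken from the Docker swarm docs.
    while read -r proto port; do
        printf 'iptables -A INPUT -s %s -p %s --dport %s -j ACCEPT\n' \
            "$src" "$proto" "$port"
    done <<EOF
tcp 2377
tcp 7946
udp 7946
udp 4789
EOF
    # Protocol 50 (ESP) is an IP protocol, not a port number.
    printf 'iptables -A INPUT -s %s -p 50 -j ACCEPT\n' "$src"
}
```

Review the printed rules before applying them. For the netshoot step, something like `docker run -it --rm --network <your-overlay> nicolaka/netshoot` gives you a shell with tcpdump, ping and similar tools, provided the overlay network was created as attachable.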
Upvotes: 4