Unable to get docker containers running disque to establish a cluster

Question

I put together a Docker image that builds Disque: https://registry.hub.docker.com/u/jobflow/disque/

I am able to deploy and run a single service. I can expose a port and connect to it from a Disque client running in the same container or in a different container.

But when I connect to one instance and send CLUSTER MEET with the IP and port of the other container, it attempts to join the other instance but eventually fails.

I can link two containers and they will cluster fine, but you cannot link a container to more than one other container.

I can run multiple Disque servers in a single container and they cluster fine. Only cross-container communication fails.

For example:

3 Disque servers running in 3 different containers
CONTAINER ID        IMAGE                   COMMAND                CREATED             STATUS              PORTS                     NAMES
0e31c5b751b5        jobflow/disque:latest   "/bin/sh -c 'disque-   5 minutes ago       Up 5 minutes        0.0.0.0:32770->7711/tcp   disque-3
d48ec8e588d5        jobflow/disque:latest   "/bin/sh -c 'disque-   5 minutes ago       Up 5 minutes        0.0.0.0:32769->7711/tcp   disque-2
8ee7ec27d210        jobflow/disque:latest   "/bin/sh -c 'disque-   10 minutes ago      Up 10 minutes       0.0.0.0:32768->7711/tcp   disque

Connect to server

# disque -h 192.168.99.100 -p 32768
192.168.99.100:32768>

Join Clusters

192.168.99.100:32768> cluster meet 192.168.99.100 32768
OK
192.168.99.100:32768>

Looks like it worked

192.168.99.100:32768> cluster info
cluster_state:ok
cluster_known_nodes:2
cluster_reachable_nodes:1
cluster_size:0
cluster_stats_messages_sent:171
cluster_stats_messages_received:0
192.168.99.100:32768>

Wait a bit.... Nope :(

192.168.99.100:32768> cluster info
cluster_state:ok
cluster_known_nodes:1
cluster_reachable_nodes:1
cluster_size:0
cluster_stats_messages_sent:296
cluster_stats_messages_received:0
192.168.99.100:32768>

Let's check out the logs (with both servers set to debug)

192.168.99.100:32768> CONFIG SET loglevel debug
OK

7:P 17 Jun 21:11:25.357 * No cluster configuration found, I'm 3eb248db697774d0fa15e06ffcbf17f71767d4a0
                                        Disque 0.0.1 (00000000/0) 64 bit
          _ -                                                        
        .                               Port: 7711
        .    o    .                     PID: 7
                 .                                                   
               -                              http://disque.io       


7:P 17 Jun 21:11:25.398 # Server started, Disque version 0.0.1
7:P 17 Jun 21:11:25.399 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
7:P 17 Jun 21:11:25.399 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
7:P 17 Jun 21:11:25.399 * The server is now ready to accept connections on port 7711
7:P 17 Jun 22:42:20.608 - 1 clients connected, 724632 bytes in use
7:P 17 Jun 22:42:25.695 - 1 clients connected, 724632 bytes in use
7:P 17 Jun 22:42:30.790 - 1 clients connected, 724632 bytes in use
7:P 17 Jun 22:42:35.901 - 1 clients connected, 724632 bytes in use
7:P 17 Jun 22:42:40.988 - 1 clients connected, 724632 bytes in use
7:P 17 Jun 22:42:46.069 - 1 clients connected, 724632 bytes in use

# disque -h 192.168.99.100 -p 32769
192.168.99.100:32769> CONFIG SET loglevel debug
OK


6:P 17 Jun 21:15:58.906 * No cluster configuration found, I'm cb52f64739b801286dbd76ceb8801ae38d43384e
                                        Disque 0.0.1 (00000000/0) 64 bit
          _ -                                                        
        .                               Port: 7711
        .    o    .                     PID: 6
                 .                                                   
               -                              http://disque.io       


6:P 17 Jun 21:15:58.920 # Server started, Disque version 0.0.1
6:P 17 Jun 21:15:58.920 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
6:P 17 Jun 21:15:58.921 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
6:P 17 Jun 21:15:58.921 * The server is now ready to accept connections on port 7711
6:P 17 Jun 22:44:52.162 - 1 clients connected, 724632 bytes in use
6:P 17 Jun 22:44:57.240 - 1 clients connected, 724632 bytes in use
6:P 17 Jun 22:45:02.318 - 1 clients connected, 724632 bytes in use
6:P 17 Jun 22:45:07.406 - 1 clients connected, 724632 bytes in use
6:P 17 Jun 22:45:12.482 - 1 clients connected, 724632 bytes in use
6:P 17 Jun 22:45:17.570 - 1 clients connected, 724632 bytes in use

Let's introduce the nodes to each other again and see what's up

# disque -h 192.168.99.100 -p 32768
192.168.99.100:32768> cluster meet 192.168.99.100 32769
OK

7:P 17 Jun 22:47:05.553 - 1 clients connected, 724600 bytes in use
7:P 17 Jun 22:47:09.838 . Connecting with Node 555abc8b9b37044e10e0a61fc28a3ce15b564696 at 192.168.99.100:42769
7:P 17 Jun 22:47:09.838 . I/O error reading from node link: Connection refused
7:P 17 Jun 22:47:09.940 . Connecting with Node 555abc8b9b37044e10e0a61fc28a3ce15b564696 at 192.168.99.100:42769
7:P 17 Jun 22:47:09.940 . I/O error reading from node link: Connection refused
7:P 17 Jun 22:47:10.041 . Connecting with Node 555abc8b9b37044e10e0a61fc28a3ce15b564696 at 192.168.99.100:42769
7:P 17 Jun 22:47:10.041 . I/O error reading from node link: Connection refused
7:P 17 Jun 22:47:10.141 . Connecting with Node 555abc8b9b37044e10e0a61fc28a3ce15b564696 at 192.168.99.100:42769
7:P 17 Jun 22:47:10.141 . I/O error reading from node link: Connection refused
7:P 17 Jun 22:47:10.244 . Connecting with Node 555abc8b9b37044e10e0a61fc28a3ce15b564696 at 192.168.99.100:42769   
7:P 17 Jun 22:47:24.673 . I/O error reading from node link: Connection refused
7:P 17 Jun 22:47:24.774 . Connecting with Node 555abc8b9b37044e10e0a61fc28a3ce15b564696 at 192.168.99.100:42769
7:P 17 Jun 22:47:24.774 . I/O error reading from node link: Connection refused
7:P 17 Jun 22:47:25.895 - 1 clients connected, 724632 bytes in use
7:P 17 Jun 22:47:30.979 - 1 clients connected, 724632 bytes in use
7:P 17 Jun 22:47:36.083 - 1 clients connected, 724632 bytes in use

WAT?

Connecting with Node 555abc8b9b37044e10e0a61fc28a3ce15b564696 at 192.168.99.100:42769

I passed in

192.168.99.100:32769

not

192.168.99.100:42769

Hmm, OK, then let's pass

192.168.99.100:32768> cluster meet 192.168.99.100 22769
OK  

 7:P 17 Jun 22:52:08.996 . Connecting with Node 540d0f780c5bf04df56c89f25315743116f20e92 at 192.168.99.100:32769

That seems to work. But ...

192.168.99.100:32768> cluster info
cluster_state:ok
cluster_known_nodes:1
cluster_reachable_nodes:1
cluster_size:0
cluster_stats_messages_sent:445
cluster_stats_messages_received:0

antirez · Accepted Answer

The problem is most likely that Docker uses port forwarding, which is not compatible with the way Disque currently works.
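
The ports in the question's debug log fit this explanation. They suggest that node-to-node traffic uses the client port plus 10000 (an assumption read off the log, matching the Redis Cluster convention), so a port that Docker remapped on the host no longer lines up with a port that is actually published. A minimal sketch of that arithmetic, where `bus_port` is a hypothetical helper and not part of Disque:

```shell
# Hypothetical helper: derive the node-to-node port from a client port,
# assuming a fixed +10000 offset (inferred from the log, not confirmed here).
bus_port() {
  echo $(( $1 + 10000 ))
}

# "cluster meet 192.168.99.100 32769" led to connection attempts on:
bus_port 32769   # prints 42769, which Docker never published
# "cluster meet 192.168.99.100 22769" led to a connection attempt on:
bus_port 22769   # prints 32769, the forwarded client port, not a bus port
```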

However, you can effectively disable port forwarding in Docker by using a 1:1 port mapping, with something like this:

$ docker run -d -p 7711:7711 ...
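
Judging by the question's log, nodes also seem to talk to each other on the client port plus 10000, so that port probably needs the same 1:1 treatment. The sketch below only prints a candidate command rather than running it; `publish_flags` is a hypothetical helper, the +10000 offset is an assumption, and the image name is the one from the question.

```shell
# Hypothetical helper: build 1:1 publish flags for a client port and for
# the assumed node-to-node port (client port + 10000).
publish_flags() {
  port=$1
  bus=$(( port + 10000 ))
  echo "-p ${port}:${port} -p ${bus}:${bus}"
}

# Print the command instead of executing it, so it can be reviewed first.
echo "docker run -d $(publish_flags 7711) jobflow/disque"
# prints: docker run -d -p 7711:7711 -p 17711:17711 jobflow/disque
```

Since only one container can bind a given host port, this layout fits one Disque node per Docker host.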

Once this is fixed in Redis Cluster, I'll also backport the fix to Disque. The fix would be to make Disque instances able to report an IP/port pair different from the one the other nodes detect automatically via the getpeername system call.
