Seer.The

Reputation: 486

RabbitMQ won't cluster (nxdomain)

I want to set up 2 RabbitMQ servers to work as a cluster. When I try to run

rabbitmqctl join_cluster rabbit@my_rabbit_1.my.domain.name on my_rabbit_1

I get the error: unable to connect to epmd (port 4369) on my_rabbit_2.my.domain.name: nxdomain (non-existing domain)

I use rabbitmq:latest (Debian), .erlang.cookie is the same on both nodes, and the hosts resolve fine: I can ping in both directions, and nmap -6 -p 4369 my_rabbit_2.my.domain.name returns 4369/tcp open epmd

EDIT:

tcpdump shows that, when resolving the hostname, rabbit or epmd does not issue both kinds of DNS query (AAAA for IPv6 and A for IPv4). It only sends the A query, which fails repeatedly with nxdomain because the host has no IPv4 address. The AAAA query is only sent when I run a command like rabbitmq -n [email protected]: then the AAAA lookup succeeds and the command prints its output. That is the problem. How do I solve it?

Upvotes: 1

Views: 6632

Answers (2)

Seer.The

Reputation: 486

I finally found a solution that worked for me. According to the Erlang documentation, the -proto_dist flag selects the protocol used for Erlang distribution, and it defaults to inet_tcp (TCP over IPv4). In an IPv6-only environment you therefore have to pass -proto_dist inet6_tcp to erl.

This can be done by adding the following lines to your rabbitmq-env.conf (see RabbitMQ configuration docs):

# For rabbitmq-server
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="-proto_dist inet6_tcp"
# For rabbitmqctl
RABBITMQ_CTL_ERL_ARGS="-proto_dist inet6_tcp"

Note that rabbitmqctl and rabbitmq-server use different erl settings: without RABBITMQ_CTL_ERL_ARGS="-proto_dist inet6_tcp" I was unable to create the cluster with rabbitmqctl join_cluster [email protected]. The rabbitmqctl setting should not be necessary in production mode. Also note that the RabbitMQ configuration docs advise against using this setting except for debugging.
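
If you are not sure whether a node's hostname is IPv6-only, a rough check like the one below may help before you touch the flags. This is only a sketch: has_record and suggest_proto_dist are names I made up for illustration, and it assumes dig is installed (on Debian it comes with the dnsutils package, which the rabbitmq image may not include). It only looks at DNS records, so it ignores /etc/hosts entries.

```shell
# Hypothetical helper: succeeds if HOST has at least one DNS record of TYPE.
# Assumes dig is available; if it is missing, this simply reports "no record".
has_record() {  # usage: has_record TYPE HOST
    [ -n "$(dig +short "$1" "$2" 2>/dev/null)" ]
}

# Hypothetical helper: print the -proto_dist value that matches the records
# a hostname has. An A record means the default inet_tcp can resolve it;
# an AAAA record alone means the node is only reachable with inet6_tcp.
suggest_proto_dist() {
    if has_record A "$1"; then
        echo inet_tcp
    elif has_record AAAA "$1"; then
        echo inet6_tcp
    else
        echo "no A or AAAA record found for $1" >&2
        return 1
    fi
}
```

For example, suggest_proto_dist my_rabbit_2.my.domain.name should print inet6_tcp when the host only has an AAAA record, which is the case where the two settings above are needed.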

Upvotes: 1

mcfinnigan

Reputation: 11638

unable to connect to epmd (port 4369) on my_rabbit_2.my.domain.name: nxdomain (non-existing domain)

This error is raised when the RabbitMQ server is running under a hostname other than the one you expect, or when the hostname doesn't resolve to what you expect.

Amusingly enough, I had this exact issue last night when one instance in our cluster failed, came back on a new hostname, and somehow corrupted its internal authentication store.

Without the exact DNS entries for your setup, all I can offer are general troubleshooting steps.

See this StackOverflow question for a resolution that may help you - in particular the answer by Kishor Pawar.

Are you sure you configured RabbitMQ to listen on IPv6? Is there a reason you can't also bind it to IPv4 on 127.0.0.1 for management operations?

Upvotes: 0
