Kitinz

Reputation: 1542

Restart Rabbitmq cluster with different user

Let me explain my current scenario before formulating the questions:

Current Scenario

I have a RabbitMQ cluster with 2 nodes, both started as root, and I also have the web administration plugin installed, which worked perfectly.

A few days ago one of the nodes went down: the consumers of some queues failed and millions of messages accumulated, so RabbitMQ wrote everything to disk (/var/lib/rabbitmq/mnesia/name_of_the_node/queues/), the filesystem filled up, and the whole node went down.

Problem/Questions

  1. After deleting all the messages on disk (I didn't need them anymore and had to free disk space) and restarting the node with rabbitmq-server -detached, the cluster kept working, but the administration plugin stopped responding. Is there a way to make it work again without restarting?
  2. I'm planning to stop the whole cluster and start it up again as the rabbitmq user instead of root (just for security reasons), and I would like to know what I should keep in mind to avoid issues. My main concern is whether the cluster will keep all the current configuration (users, policies, exchanges, queues and bindings) after starting it as the rabbitmq user.
  3. I'm not sure how to do the restart so as to minimize problems, and I also want to guarantee that the web admin plugin will work after the restart.

    Option 1:
    Stop all the nodes with root --> Start all the nodes with rabbitmq

    Option 2:
    Stop node1 with root --> Start node1 with rabbitmq
    Stop node2 with root --> Start node2 with rabbitmq

I'm also open to any other advice or suggestions you may have.

Upvotes: 0

Views: 1305

Answers (1)

  1. It is hard to answer your question without more information. You should at least take a look at the log files and/or post them somewhere.

  2. After you have stopped a node running as root, change the ownership of the entire /var/lib/rabbitmq directory to rabbitmq:rabbitmq. Do the same with /var/log/rabbitmq. Those are the only places where RabbitMQ writes data with the official packages and default configuration.

    Because it previously ran as root, Erlang stored its cookie, the shared secret "key" used to allow inter-node communication, in /root/.erlang.cookie. You need to copy it to /var/lib/rabbitmq/.erlang.cookie and fix ownership and permissions: it must be readable by the owner only, so a permission of 0400 or 0600; Erlang will complain if it's readable by the group or anyone.
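
    The ownership and cookie steps above could be scripted roughly as follows. This is a minimal sketch, not a tested procedure: install_cookie is a hypothetical helper name, and the paths in the comments are the ones mentioned above, assuming the official packages and default configuration.

```shell
#!/bin/sh
# install_cookie is a hypothetical helper (not part of RabbitMQ or Erlang).
# It copies a cookie file and makes it readable by its owner only.
install_cookie() {
    src=$1
    dst=$2
    # -f lets cp replace an existing read-only destination file
    cp -f "$src" "$dst" || return 1
    # 0400: owner read-only; Erlang rejects group- or world-readable cookies
    chmod 0400 "$dst"
}

# Intended use, as root, once the node is stopped (paths assumed from above):
#   install_cookie /root/.erlang.cookie /var/lib/rabbitmq/.erlang.cookie
#   chown -R rabbitmq:rabbitmq /var/lib/rabbitmq /var/log/rabbitmq
```

    Running chown -R after the copy means the new cookie file picks up the rabbitmq owner along with everything else in that directory.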

  3. You can and should do it one node at a time (unless you updated Erlang or RabbitMQ in the meantime). Pay attention to the Erlang cookie I mentioned above. If you start a node with a cookie different from the one used by the other running node, they will not be able to communicate.

    To ensure the cookie is correct before you restart RabbitMQ, you can try to ping the other running RabbitMQ node:

    # Open a shell as the `rabbitmq` user and run:
    erl -A0 -noinput -noshell -sname foobar \
     -eval "io:format(\"~p~n\", [net_adm:ping('rabbit@other-hostname')]), halt()."
    

    In the command line above, replace other-hostname with the hostname of the other RabbitMQ node. This command should print pong if everything is ok. If it displays pang, something is wrong.
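
    If you want to use that check in a script, the printed result can be turned into an exit status. A minimal sketch: interpret_ping is a hypothetical helper name, and the messages are only illustrative.

```shell
#!/bin/sh
# interpret_ping is a hypothetical helper: it maps the text printed by
# net_adm:ping/1 (pong or pang) to a message and an exit status.
interpret_ping() {
    case "$1" in
        pong) echo "pong: node reachable, cookie accepted"; return 0 ;;
        pang) echo "pang: no connection (cookie mismatch, node down, or hostname unreachable)"; return 1 ;;
        *)    echo "unexpected output: $1"; return 2 ;;
    esac
}

# Intended use, as the rabbitmq user (replace other-hostname as above):
#   result=$(erl -A0 -noinput -noshell -sname foobar \
#     -eval "io:format(\"~p~n\", [net_adm:ping('rabbit@other-hostname')]), halt().")
#   interpret_ping "$result" || exit 1
```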

Upvotes: 1
