Reputation: 803
I have a cookbook that fully configures a client node to use Consul, except for joining the client to the cluster. The following resource is failing:
execute "join consul" do
  command "/usr/local/bin/consul join #{consul_server}"
  action :nothing
end
However, running what I think is the same command directly on the instance works once the cookbook run has failed:
/usr/local/bin/consul join server-001.flapjacks.com
The service is set up via a systemd unit that points the agent at the config directory /etc/consul.d:
[Unit]
Description="HashiCorp Consul - A service mesh solution"
Documentation=https://www.consul.io/
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/etc/consul.d/consul.hcl
[Service]
User=root
Group=root
ExecStart=/usr/local/bin/consul agent -config-dir /etc/consul.d
ExecReload=/usr/local/bin/consul reload
KillMode=process
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
When I check the service, it is running properly.
The execute resource is triggered by a template resource:
template '/etc/consul.d/webserver.json' do
  source 'webserver.json.erb'
  owner 'root'
  group 'root'
  mode '0644'
  action :create
  notifies :restart, resources(:service => "consul")
  notifies :run, "execute[join consul]"
end
The Chef run fails with this error:
STDERR: Error joining address 'server-001.flapjacks.com': Put http://127.0.0.1:8500/v1/agent/join/server-001.flapjacks.com: dial tcp 127.0.0.1:8500: connect: connection refused
Failed to join any nodes.
Any ideas on why this is not working?
Upvotes: 0
Views: 289
Reputation: 921
The error suggests that the Consul agent on the node is not yet accepting connections when the join runs. This is most likely because the template change notifies both the consul restart and the join execute, so the two run back to back: the join fires while Consul is still restarting, and it fails because the agent is not reachable on 127.0.0.1:8500.
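If the join is simply running before the local agent is listening again, one possible workaround is to let the execute resource retry instead of failing on the first refused connection. The sketch below reuses the resource from the question and adds Chef's common `retries` and `retry_delay` properties. The values 5 and 5 are arbitrary assumptions, not tuned numbers, and `consul_server` is taken to be defined as in the original recipe.

```ruby
# Sketch only: the retry count and delay are assumed values; adjust them to
# however long the agent takes to start listening on 127.0.0.1:8500.
execute "join consul" do
  command "/usr/local/bin/consul join #{consul_server}"
  retries 5        # re-run the command up to 5 more times if it fails
  retry_delay 5    # wait 5 seconds between attempts
  action :nothing  # still only runs when notified by the template resource
end
```

With this in place the template's two notifications stay as they are: the restart is queued first, the join follows, and the join keeps retrying until the agent answers or the attempts run out.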
Upvotes: 0