Reputation: 685
I'm trying to start 3 ZooKeeper services on the same host on my development computer. This is obviously not something I'll do in production; I'm doing it to explore fault tolerance and the Kafka dependency on my test/development computer.
I have installed Kafka 2.5.0 on my dev computer, and I was able to successfully set up 3 Kafka services on the host with 1 ZooKeeper service on the same host, using the ZooKeeper scripts and package that come with the Kafka package.
The problems started when I tried to set up 3 ZooKeeper services. I did the following, but I'm not able to start the services successfully. I have 3 config files:
config/zookeeper.properties
config/zookeeper1.properties
config/zookeeper2.properties
The content of config/zookeeper.properties is:
dataDir=/tmp/zookeeper
clientPort=2181
maxClientCnxns=0
admin.enableServer=false
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
In config/zookeeper1.properties I have clientPort=2182 and dataDir=/tmp/zookeeper1.
In config/zookeeper2.properties I have clientPort=2183 and dataDir=/tmp/zookeeper2.
I also created the files /tmp/zookeeper/myid, /tmp/zookeeper1/myid and /tmp/zookeeper2/myid, containing the ids 1, 2 and 3 respectively.
When I start the 3 ZooKeepers from the command line, they start OK:
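The directory and myid setup described above could be scripted roughly as follows. This is only a sketch: the function name write_myids and its optional base-directory argument are made up for illustration, while the directory names and ids are the ones from the question.

```shell
#!/bin/sh
# Sketch: create one data directory per ZooKeeper instance and write its myid.
# write_myids is a hypothetical helper; its base directory defaults to /tmp,
# matching the dataDir values in the three properties files.
write_myids() {
    base=${1:-/tmp}
    id=1
    for dir in zookeeper zookeeper1 zookeeper2; do
        mkdir -p "$base/$dir"
        # The number in myid must match the N of the server.N line for this instance.
        echo "$id" > "$base/$dir/myid"
        id=$((id + 1))
    done
}
```

Calling write_myids with no argument writes /tmp/zookeeper/myid, /tmp/zookeeper1/myid and /tmp/zookeeper2/myid. Note that files created under sudo end up owned by root, which matters if the service later runs as a different user.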
$ sudo bin/zookeeper-server-start.sh config/zookeeper.properties
$ sudo bin/zookeeper-server-start.sh config/zookeeper1.properties
$ sudo bin/zookeeper-server-start.sh config/zookeeper2.properties
and I can also see who the leader and followers are by:
$ echo srvr | nc localhost 2181 | grep Mode
Mode: follower
$ echo srvr | nc localhost 2182 | grep Mode
Mode: leader
$ echo srvr | nc localhost 2183 | grep Mode
Mode: follower
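The three nc checks above can be wrapped in a small loop. A minimal sketch, assuming nc is installed and the client ports are the ones from the properties files; zk_mode and check_ensemble are hypothetical helper names, not part of Kafka or ZooKeeper.

```shell
#!/bin/sh
# zk_mode reads the output of ZooKeeper's srvr command on stdin and prints
# only the value of the "Mode:" line (for example leader or follower).
zk_mode() {
    sed -n 's/^Mode: //p'
}

# check_ensemble asks each client port for its mode; an instance that does
# not answer is reported as "no answer" instead of being skipped silently.
check_ensemble() {
    for port in 2181 2182 2183; do
        mode=$(echo srvr | nc localhost "$port" 2>/dev/null | zk_mode)
        echo "$port: ${mode:-no answer}"
    done
}
```

Running check_ensemble prints one line per port, which makes it easier to spot an instance that is down.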
But when I try setting them up as system services I'm unable to start them properly. Here are the unit files I have:
$ cat /etc/systemd/system/zookeeper.service
[Unit]
Description=zookeeper
After=syslog.target network.target
[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
[Install]
WantedBy=multi-user.target
$ cat /etc/systemd/system/zookeeper1.service
[Unit]
Description=zookeeper 1
After=syslog.target network.target
[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper1.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop1.sh
[Install]
WantedBy=multi-user.target
$ cat /etc/systemd/system/zookeeper2.service
[Unit]
Description=zookeeper 2
After=syslog.target network.target
[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper2.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop2.sh
[Install]
WantedBy=multi-user.target
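Each unit points at a different stop script (zookeeper-server-stop.sh, zookeeper-server-stop1.sh, zookeeper-server-stop2.sh), and the question does not say whether the last two exist on this machine. Before starting the services it may be worth confirming that every ExecStart/ExecStop path exists and is executable. A minimal sketch: check_unit_paths is a hypothetical helper, it only looks at the first word of each Exec line, and it ignores systemd prefixes such as - or @.

```shell
#!/bin/sh
# check_unit_paths prints "ok" or "missing" for the program named on each
# ExecStart=/ExecStop= line of a unit file, and returns 1 if any is missing.
check_unit_paths() {
    unit_file=$1
    status=0
    # Extract the first word after ExecStart= or ExecStop= (the program path).
    for path in $(sed -n -e 's/^ExecStart=\([^ ]*\).*/\1/p' \
                         -e 's/^ExecStop=\([^ ]*\).*/\1/p' "$unit_file"); do
        if [ -x "$path" ]; then
            echo "ok       $path"
        else
            echo "missing  $path"
            status=1
        fi
    done
    return "$status"
}

# Example call for the three units shown above:
# for u in zookeeper zookeeper1 zookeeper2; do
#     check_unit_paths "/etc/systemd/system/$u.service"
# done
```

The -x test is evaluated for the user running the check, so the result can differ from what the kafka user would see.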
After trying to start them with
$ sudo systemctl daemon-reload
$ sudo systemctl enable zookeeper
$ sudo systemctl enable zookeeper1
$ sudo systemctl enable zookeeper2
$ sudo systemctl start zookeeper
$ sudo systemctl start zookeeper1
$ sudo systemctl start zookeeper2
I don't see them running.
In the system log I see this:
May 17 03:56:20 melly-dev2 kafka-server-start.sh: [2020-05-17 03:56:20,039] INFO Opening socket connection to server localhost/127.0.0.1:2183. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
May 17 03:56:20 melly-dev2 kafka-server-start.sh: [2020-05-17 03:56:20,040] INFO Socket error occurred: localhost/127.0.0.1:2183: Connection refused (org.apache.zookeeper.ClientCnxn)
Here's what I see in the output of sudo journalctl -u zookeeper.service:
[2020-05-17 06:33:33,096] INFO Notification time out: 6400 (org.apache.zookeeper.server.quorum.FastLeaderElection)
[2020-05-17 06:33:39,497] WARN Cannot open channel to 2 at election address localhost/127.0.0.1:3889 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:650)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:707)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:735)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:910)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1247)
[2020-05-17 06:33:39,497] WARN Cannot open channel to 3 at election address localhost/127.0.0.1:3890 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:650)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:707)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:735)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:910)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1247)
[2020-05-17 06:33:39,497] INFO Notification time out: 12800 (org.apache.zookeeper.server.quorum.FastLeaderElection)
How can I set/find the ZooKeeper log files, and how can I make ZooKeeper start successfully as a service?
Upvotes: 1
Views: 558
Reputation: 685
The missing step in my procedure was:
sudo chown -R kafka:kafka /tmp/zookeeper
sudo chown -R kafka:kafka /tmp/zookeeper1
sudo chown -R kafka:kafka /tmp/zookeeper2
sudo chmod -R 777 /tmp/zookeeper
sudo chmod -R 777 /tmp/zookeeper1
sudo chmod -R 777 /tmp/zookeeper2
One of the issues was that the default ZooKeeper that comes with Kafka has no log file that shows the write error. Once I ran this command (based on GiorgosMyrianthous's comment):
journalctl -u zookeeper.service
I could clearly see the error and fix the problem.
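Since there are three units, the same journalctl check can be repeated for each one and narrowed down to the lines that matter. A rough sketch, assuming the unit names from the question; zk_errors and show_zk_errors are hypothetical helper names, and the 200-line window is an arbitrary choice.

```shell
#!/bin/sh
# zk_errors keeps only ERROR lines and Java exception lines from a log on stdin.
# "|| true" stops grep's "no match" exit status from looking like a failure.
zk_errors() {
    grep -E 'ERROR|Exception' || true
}

# show_zk_errors prints the filtered tail of the journal for each unit.
show_zk_errors() {
    for unit in zookeeper zookeeper1 zookeeper2; do
        echo "== $unit.service =="
        journalctl -u "$unit.service" --no-pager -n 200 | zk_errors
    done
}
```

Depending on the system, reading the journal for a system unit may require sudo or membership in a group with journal access.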
Upvotes: 0
Reputation: 39810
[2020-05-17 07:02:35,248] ERROR Unable to access datadir, exiting abnormally (org.apache.zookeeper.server.quorum.QuorumPeerMain)
org.apache.zookeeper.server.persistence.FileTxnSnapLog$DatadirException: Cannot write to data directory /tmp/zookeeper1/version-2
I suspect that the user running the service (kafka, according to your unit files) does not have permission to write to the data directory /tmp/zookeeper1/.
Also, make sure to change dataDir to a permanent location (i.e. not under /tmp/), as everything will be lost once your machine turns off.
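To test the permission suspicion directly, one option is to check whether the data directories are writable by the user in question. A minimal sketch: report_writable is a hypothetical helper that tests the current user only, so it needs to be run as the service user to be meaningful.

```shell
#!/bin/sh
# report_writable prints, for each directory given, whether the current user
# can write to it, and returns 1 if any directory is missing or not writable.
report_writable() {
    status=0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "writable      $dir"
        else
            echo "not writable  $dir"
            status=1
        fi
    done
    return "$status"
}
```

For example, saving this to a file (say check.sh, an assumed name) with a final line report_writable "$@" and running sudo -u kafka sh check.sh /tmp/zookeeper /tmp/zookeeper1 /tmp/zookeeper2 would show what the kafka user can write to. Subdirectories such as version-2 can be passed in the same way.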
Upvotes: 1