MSE

Reputation: 685

Unable to start multiple Zookeeper services on Kafka dev box

I'm trying to start 3 ZooKeeper services on the same host on my development computer. This is obviously not something I'd do in production; I'm doing it to explore fault tolerance and Kafka's dependency on ZooKeeper on my test/development computer.

I have installed Kafka 2.5.0 on my dev computer, and I was able to successfully set up 3 Kafka services on the host with 1 ZooKeeper service on the same host, using the ZooKeeper scripts and package that come with the Kafka package.

The problems started when I tried to set up 3 ZooKeeper services. I did the following, but I'm not able to start the services successfully. I have 3 config files:

config/zookeeper.properties
config/zookeeper1.properties
config/zookeeper2.properties

The content of config/zookeeper.properties is:

dataDir=/tmp/zookeeper
clientPort=2181
maxClientCnxns=0
admin.enableServer=false
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

In config/zookeeper1.properties I have clientPort=2182 and dataDir=/tmp/zookeeper1.

In config/zookeeper2.properties I have clientPort=2183 and dataDir=/tmp/zookeeper2.

I also created the files /tmp/zookeeper/myid, /tmp/zookeeper1/myid and /tmp/zookeeper2/myid, containing the id values 1, 2 and 3 respectively.
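For reference, the myid setup described above can be sketched as a small script. This is only an illustration: the make_myid helper and the BASE override are hypothetical names, not part of Kafka or ZooKeeper, and the ids are assumed to match the N in the server.N lines of the config.

```shell
#!/usr/bin/env bash
# Sketch: create one data directory and myid file per ZooKeeper instance.
# make_myid and BASE are hypothetical names introduced for this example.
set -eu

make_myid() {
  # $1 = data directory (the dataDir value), $2 = server id (the N in server.N=...)
  mkdir -p "$1"
  printf '%s\n' "$2" > "$1/myid"
}

# BASE defaults to /tmp to mirror the dataDir values used in the question.
BASE="${BASE:-/tmp}"
make_myid "$BASE/zookeeper"  1
make_myid "$BASE/zookeeper1" 2
make_myid "$BASE/zookeeper2" 3
```

Files created this way belong to whichever user runs the script, which matters once the services run under a different account.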

When I start the 3 ZooKeeper instances from the command line, they start OK:

$ sudo bin/zookeeper-server-start.sh config/zookeeper.properties
$ sudo bin/zookeeper-server-start.sh config/zookeeper1.properties
$ sudo bin/zookeeper-server-start.sh config/zookeeper2.properties

and I can also see which instance is the leader and which are followers:

$ echo srvr | nc localhost 2181 | grep Mode
Mode: follower
$ echo srvr | nc localhost 2182 | grep Mode
Mode: leader
$ echo srvr | nc localhost 2183 | grep Mode
Mode: follower

But when I try setting them up as systemd services, I'm unable to start them properly. Here are the unit files I have:

$ cat /etc/systemd/system/zookeeper.service
[Unit]
Description=zookeeper
After=syslog.target network.target
[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
[Install]
WantedBy=multi-user.target

$ cat /etc/systemd/system/zookeeper1.service
[Unit]
Description=zookeeper 1
After=syslog.target network.target
[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper1.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop1.sh
[Install]
WantedBy=multi-user.target

$ cat /etc/systemd/system/zookeeper2.service
[Unit]
Description=zookeeper 2
After=syslog.target network.target
[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper2.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop2.sh
[Install]
WantedBy=multi-user.target

After trying to start them with

$ sudo systemctl daemon-reload
$ sudo systemctl enable zookeeper
$ sudo systemctl enable zookeeper1
$ sudo systemctl enable zookeeper2
$ sudo systemctl start zookeeper
$ sudo systemctl start zookeeper1
$ sudo systemctl start zookeeper2

I don't see them running.

In the system log I see this:

May 17 03:56:20 melly-dev2 kafka-server-start.sh: [2020-05-17 03:56:20,039] INFO Opening socket connection to server localhost/127.0.0.1:2183. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
May 17 03:56:20 melly-dev2 kafka-server-start.sh: [2020-05-17 03:56:20,040] INFO Socket error occurred: localhost/127.0.0.1:2183: Connection refused (org.apache.zookeeper.ClientCnxn)

Here's what I see in sudo journalctl -u zookeeper.service:

[2020-05-17 06:33:33,096] INFO Notification time out: 6400 (org.apache.zookeeper.server.quorum.FastLeaderElection)
[2020-05-17 06:33:39,497] WARN Cannot open channel to 2 at election address localhost/127.0.0.1:3889 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:650)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:707)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:735)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:910)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1247)
[2020-05-17 06:33:39,497] WARN Cannot open channel to 3 at election address localhost/127.0.0.1:3890 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:650)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:707)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:735)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:910)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1247)
[2020-05-17 06:33:39,497] INFO Notification time out: 12800 (org.apache.zookeeper.server.quorum.FastLeaderElection)

How can I configure or find the ZooKeeper log files, and how can I make ZooKeeper start successfully as a service?

Upvotes: 1

Views: 558

Answers (2)

MSE

Reputation: 685

The missing step in my procedure was:

sudo chown -R kafka:kafka /tmp/zookeeper
sudo chown -R kafka:kafka /tmp/zookeeper1
sudo chown -R kafka:kafka /tmp/zookeeper2

sudo chmod -R 777 /tmp/zookeeper 
sudo chmod -R 777 /tmp/zookeeper1
sudo chmod -R 777 /tmp/zookeeper2

One of the issues was that the default ZooKeeper that comes with Kafka has no log file that shows the write error. Once I ran this command (based on Giorgos Myrianthous's comment):

journalctl -u zookeeper.service

I could clearly see the error and fix the problem.
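A quick pre-flight check along these lines might catch the same problem before starting the units. It is a minimal sketch, assuming the dataDir values from the question; check_writable and the file name check_datadirs.sh are hypothetical. It also looks at the version-2 subdirectory named in the error, which was probably left owned by root by the earlier sudo runs (an assumption, not something the logs confirm).

```shell
#!/usr/bin/env bash
# Sketch: report whether the invoking user can write to each ZooKeeper dataDir.
# Intended to be run as the service account, for example:
#   sudo -u kafka bash check_datadirs.sh   (check_datadirs.sh is a hypothetical file name)

check_writable() {
  # Succeeds only if $1 is an existing directory the current user can write to.
  if [ -d "$1" ] && [ -w "$1" ]; then
    echo "OK:   $1"
  else
    echo "FAIL: $1 is missing or not writable"
    return 1
  fi
}

# Default to the three dataDir values from the question when no arguments are given.
if [ "$#" -eq 0 ]; then
  set -- /tmp/zookeeper /tmp/zookeeper1 /tmp/zookeeper2
fi

failed=0
for dir in "$@"; do
  check_writable "$dir" || failed=1
  # ZooKeeper keeps snapshots and transaction logs in version-2 under dataDir.
  if [ -e "$dir/version-2" ]; then
    check_writable "$dir/version-2" || failed=1
  fi
done
echo "failed=$failed"
```

If every directory is reported as OK for the kafka user, the chown step alone should be enough and the chmod -R 777 is likely unnecessary.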

Upvotes: 0

Giorgos Myrianthous

Reputation: 39810

[2020-05-17 07:02:35,248] ERROR Unable to access datadir, exiting abnormally (org.apache.zookeeper.server.quorum.QuorumPeerMain) 
org.apache.zookeeper.server.persistence.FileTxnSnapLog$DatadirException: Cannot write to data directory /tmp/zookeeper1/version-2 

I suspect that the user the service runs as (kafka, per User=kafka in your unit files; melly-dev2 is the hostname) does not have write access to the data directories, e.g. /tmp/zookeeper1/.

Also, make sure to change dataDir to a permanent location (i.e. not under /tmp/), since files under /tmp are typically deleted when the machine reboots.
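Moving dataDir could look roughly like the sketch below. The set_datadir helper is a hypothetical name, and /var/lib/zookeeper/0 is only an example location, not a path that Kafka or ZooKeeper prescribes. The helper assumes the new path contains no | or & characters, since it is passed straight into a sed replacement.

```shell
#!/usr/bin/env bash
# Sketch: point a ZooKeeper properties file at a new dataDir.
# set_datadir is a hypothetical helper; the target paths below are examples only.

set_datadir() {
  # $1 = properties file, $2 = new dataDir value
  local tmp
  tmp="$(mktemp)"
  # Rewrite only the dataDir line, then write the result back into the same file
  # so its existing owner and permissions are kept.
  sed "s|^dataDir=.*|dataDir=$2|" "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}

# Possible usage for the first instance (run with sufficient privileges):
#   sudo mkdir -p /var/lib/zookeeper/0
#   sudo cp /tmp/zookeeper/myid /var/lib/zookeeper/0/myid
#   sudo chown -R kafka:kafka /var/lib/zookeeper
#   set_datadir /opt/kafka/config/zookeeper.properties /var/lib/zookeeper/0
```

Each instance needs its own directory and its own myid file, as with the /tmp layout.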

Upvotes: 1
