Sergey

Reputation: 15

Apache Zookeeper multi-node communication error

I have three ZooKeeper nodes: Apache ZooKeeper 3.4.8, Java 1.8_77, RedHat 6.7, SELinux disabled, firewall disabled, IPv6 disabled.

Hosts:

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4

::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.0.133 v175
192.168.0.134 v176
192.168.0.125 V177

Config:

tickTime=2000
dataDir=/home/znode/datadir
clientPort=2181
initLimit=5
syncLimit=2
server.1=v175:2888:3888
server.2=v176:2888:3888
server.3=v177:2888:3888
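For reference, a minimal sketch of how the server.N lines above break down into host, quorum port, and election port. The helper name parse_servers and the CONFIG constant are hypothetical, introduced only for illustration; the config text is copied from this post.

```python
import re

# server.<id>=<host>:<quorum port>:<election port>
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:\s]+):(\d+):(\d+)")

# Config text copied from the question above.
CONFIG = """\
tickTime=2000
dataDir=/home/znode/datadir
clientPort=2181
initLimit=5
syncLimit=2
server.1=v175:2888:3888
server.2=v176:2888:3888
server.3=v177:2888:3888
"""


def parse_servers(cfg_text):
    """Return {server_id: (host, quorum_port, election_port)} from zoo.cfg text."""
    servers = {}
    for raw in cfg_text.splitlines():
        match = SERVER_LINE.match(raw.strip())
        if match:
            sid, host, quorum, election = match.groups()
            servers[int(sid)] = (host, int(quorum), int(election))
    return servers


if __name__ == "__main__":
    for sid, (host, quorum, election) in sorted(parse_servers(CONFIG).items()):
        print(f"server {sid}: host={host} quorum={quorum} election={election}")
```

Each server id here is expected to match the number stored in the myid file under dataDir on that host.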

Errors:

essage format version), 2 (n.leader), 0x0 (n.zxid), 0x100f (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LEADING (my state)
2016-04-05 16:26:00,270 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@600] - Notification: 1 (message format version), 2 (n.leader), 0x0 (n.zxid), 0x100f (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LEADING (my state)
2016-04-05 16:26:03,099 [myid:3] - WARN  [QuorumPeer[myid=3]/0.0.0.0:2181:QuorumPeer@862] - Unexpected exception
java.lang.InterruptedException: Timeout while waiting for epoch from quorum
        at org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:881)
        at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:380)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:859)
2016-04-05 16:26:03,100 [myid:3] - INFO  [QuorumPeer[myid=3]/0.0.0.0:2181:Leader@496] - Shutting down
2016-04-05 16:26:03,100 [myid:3] - INFO  [QuorumPeer[myid=3]/0.0.0.0:2181:Leader@502] - Shutdown called
java.lang.Exception: shutdown Leader! reason: Forcing shutdown
        at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:502)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:865)
2016-04-05 16:26:03,100 [myid:3] - INFO  [QuorumPeer[myid=3]/0.0.0.0:2181:QuorumPeer@774] - LOOKING
2016-04-05 16:26:03,100 [myid:3] - INFO  [LearnerCnxAcceptor-V177/192.168.0.125:2888:Leader$LearnerCnxAcceptor@325] - exception while shutting down acceptor: java.net.SocketException: Socket closed
2016-04-05 16:26:03,100 [myid:3] - INFO  [QuorumPeer[myid=3]/0.0.0.0:2181:FastLeaderElection@818] - New election. My id =  3, proposed zxid=0x0
2016-04-05 16:26:03,102 [myid:3] - WARN  [WorkerSender[myid=3]:QuorumCnxManager@400] - Cannot open channel to 2 at election address v176/192.168.0.134:3888
java.net.NoRouteToHostException: No route to host
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:354)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:452)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:433)
        at java.lang.Thread.run(Thread.java:745)
2016-04-05 16:26:03,103 [myid:3] - INFO  [WorkerSender[myid=3]:QuorumPeer$QuorumServer@149] - Resolved hostname: v176 to address: v176/192.168.0.134
2016-04-05 16:26:03,102 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@600] - Notification: 1 (message format version), 3 (n.leader), 0x0 (n.zxid), 0x100f (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state)

The servers cannot communicate with each other. Help!

Upvotes: 1

Views: 1593

Answers (2)

roguequery

Reputation: 974

If you run netstat -tulnap on each server, are ports 2888, 3888, and 2181 open and listening, or can only localhost (127.0.0.1) on each of those boxes reach 2181, 2888, and 3888?
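To test this from the other side, here is a rough sketch that tries a plain TCP connect to each port from one node to its peers. The function names can_connect and check_peers are made up for this example, and the 3-second timeout is an arbitrary assumption.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, "No route to host", timeouts,
        # and hostname resolution failures.
        return False


def check_peers(hosts, ports=(2181, 2888, 3888)):
    """Return {(host, port): reachable} for every host/port combination."""
    return {(host, port): can_connect(host, port) for host in hosts for port in ports}
```

Running something like check_peers(["v175", "v176", "v177"]) on each node in turn should show which host/port pairs are actually reachable over the network, as opposed to merely listening locally.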

EDIT:

Looking at your netstat I see this:

tcp 0 0 192.168.0.125:2888 0.0.0.0:* LISTEN
tcp 0 0 192.168.0.125:3888

This means you need to modify your /etc/hosts to use 0.0.0.0 with the hostname.

So if the box's hostname is zoobox1, /etc/hosts needs to contain these lines:

127.0.0.1 localhost
0.0.0.0 zoobox1

This way ZooKeeper will listen on ports 2888 and 3888 on all interfaces (which netstat -tulnap will then show), so servers other than localhost can connect.
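If it helps to compare the bind addresses before and after the change, a small sketch like the following pulls the listening TCP sockets out of netstat -tulnap output. The helper names and the SAMPLE constant are hypothetical; the sample lines are taken from the V177 output posted in this thread.

```python
# Lines copied from the V177 netstat output posted in this thread.
SAMPLE = """\
tcp        0      0 0.0.0.0:2181                0.0.0.0:*                   LISTEN      5547/java
tcp        0      0 192.168.0.125:2888          0.0.0.0:*                   LISTEN      5547/java
tcp        0      0 192.168.0.125:3888          0.0.0.0:*                   LISTEN      5547/java
tcp        0      0 192.168.0.125:58200         192.168.0.133:3888          ESTABLISHED 5547/java
"""


def listening_sockets(netstat_text):
    """Return [(local_address, port)] for TCP sockets in LISTEN state."""
    result = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Columns: proto, recv-q, send-q, local address, foreign address, state, ...
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[5] == "LISTEN":
            address, _, port = fields[3].rpartition(":")
            result.append((address, int(port)))
    return result


def bind_address(listeners, port):
    """Return the local address a port is bound to, or None if not listening."""
    for address, listen_port in listeners:
        if listen_port == port:
            return address
    return None


if __name__ == "__main__":
    listeners = listening_sockets(SAMPLE)
    for port in (2181, 2888, 3888):
        print(port, bind_address(listeners, port))
```

An address of 0.0.0.0 means the port is bound on all interfaces; a specific IP means it accepts connections only on that interface.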

Upvotes: 1

Sergey

Reputation: 15

V175 - doesn't have a listener on 2888

[root@v175 ~]# netstat -tulnap
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
tcp        0      0 0.0.0.0:2181                0.0.0.0:*                   LISTEN      3201/java
tcp        0      0 0.0.0.0:45423               0.0.0.0:*                   LISTEN      3201/java
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      1236/rpcbind
tcp        0      0 192.168.0.133:3888          0.0.0.0:*                   LISTEN      3201/java
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      1486/sshd
tcp        0      0 127.0.0.1:631               0.0.0.0:*                   LISTEN      1319/cupsd
tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      1577/master
tcp        0      0 0.0.0.0:54618               0.0.0.0:*                   LISTEN      1260/rpc.statd
tcp        0      0 192.168.0.133:3888          192.168.0.134:42183         ESTABLISHED 3201/java
tcp        0      0 192.168.0.133:22            10.206.171.250:50630        ESTABLISHED 4838/sshd
tcp        0      0 192.168.0.133:3888          192.168.0.125:58200         ESTABLISHED 3201/java
udp        0      0 0.0.0.0:983                 0.0.0.0:*                               1236/rpcbind
udp        0      0 0.0.0.0:52328               0.0.0.0:*                               1260/rpc.statd
udp        0      0 0.0.0.0:111                 0.0.0.0:*                               1236/rpcbind
udp        0      0 127.0.0.1:1012              0.0.0.0:*                               1260/rpc.statd
udp        0      0 0.0.0.0:631                 0.0.0.0:*                               1319/cupsd

V176

[root@v176 ~]# netstat -antup
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
tcp        0      0 0.0.0.0:2181                0.0.0.0:*                   LISTEN      5553/java
tcp        0      0 192.168.0.134:2888          0.0.0.0:*                   LISTEN      5553/java
tcp        0      0 0.0.0.0:60845               0.0.0.0:*                   LISTEN      1263/rpc.statd
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      1239/rpcbind
tcp        0      0 192.168.0.134:3888          0.0.0.0:*                   LISTEN      5553/java
tcp        0      0 0.0.0.0:38485               0.0.0.0:*                   LISTEN      5553/java
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      1487/sshd
tcp        0      0 127.0.0.1:631               0.0.0.0:*                   LISTEN      1322/cupsd
tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      1578/master
tcp        0     64 192.168.0.134:22            10.206.171.250:50927        ESTABLISHED 10784/sshd
tcp        0      0 192.168.0.134:49506         192.168.0.133:3888          ESTABLISHED 5553/java
udp        0      0 0.0.0.0:38979               0.0.0.0:*                               1263/rpc.statd
udp        0      0 0.0.0.0:985                 0.0.0.0:*                               1239/rpcbind
udp        0      0 0.0.0.0:111                 0.0.0.0:*                               1239/rpcbind
udp        0      0 0.0.0.0:631                 0.0.0.0:*                               1322/cupsd
udp        0      0 127.0.0.1:1015              0.0.0.0:*                               1263/rpc.statd

V177

[root@v177 ~]# netstat -tulnap
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
tcp        0      0 0.0.0.0:2181                0.0.0.0:*                   LISTEN      5547/java
tcp        0      0 192.168.0.125:2888          0.0.0.0:*                   LISTEN      5547/java
tcp        0      0 0.0.0.0:40904               0.0.0.0:*                   LISTEN      5547/java
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      1245/rpcbind
tcp        0      0 192.168.0.125:3888          0.0.0.0:*                   LISTEN      5547/java
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      1494/sshd
tcp        0      0 0.0.0.0:40694               0.0.0.0:*                   LISTEN      1269/rpc.statd
tcp        0      0 127.0.0.1:631               0.0.0.0:*                   LISTEN      1328/cupsd
tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      1585/master
tcp        0      0 192.168.0.125:22            10.206.171.250:50933        ESTABLISHED 10771/sshd
tcp        0      0 192.168.0.125:58200         192.168.0.133:3888          ESTABLISHED 5547/java
udp        0      0 127.0.0.1:1023              0.0.0.0:*                               1269/rpc.statd
udp        0      0 0.0.0.0:992                 0.0.0.0:*                               1245/rpcbind
udp        0      0 0.0.0.0:59619               0.0.0.0:*                               1269/rpc.statd
udp        0      0 0.0.0.0:111                 0.0.0.0:*                               1245/rpcbind
udp        0      0 0.0.0.0:631                 0.0.0.0:*                               1328/cupsd

Upvotes: 0
