Reputation: 33
I'm running HBase 2.0.4 with Hadoop 2.8.5 on CentOS 7, with 1 master node and 4 slave nodes. I've tried the same setup with HBase 2.1.3, and the same problem occurs.
The HMaster fails to start because its ZooKeeper client cannot resolve the region server hostnames, as seen in this error log:
2019-03-29 13:58:34,961 INFO [main] zookeeper.ZooKeeper: Initiating client connection, connectString=node0.ken:2181, node1.ken:2181, node2.ken:2181, node3.ken:2181, node4.ken:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.PendingWatcher@165e389b
2019-03-29 13:58:34,965 WARN [main] zookeeper.RecoverableZooKeeper: Unable to create ZooKeeper Connection
java.net.UnknownHostException: node1.ken: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
at java.net.InetAddress.getAllByName(InetAddress.java:1193)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.checkZk(RecoverableZooKeeper.java:131)
My config files look as follows:
---- hbase-site.xml ----
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://node0.ken:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>1200000</value>
</property>
<property>
<name>hbase.zookeeper.session.timeout</name>
<value>1200000</value>
</property>
<property>
<name>hbase.zookeeper.property.tickTime</name>
<value>6000</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>hdfs://node0.ken:9000/zookeeper</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>node0.ken, node1.ken, node2.ken, node3.ken, node4.ken</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/node0.ken</value>
</property>
</configuration>
---- regionservers ----
node1.ken
node2.ken
node3.ken
node4.ken
---- /etc/hosts ----
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.158.57.150 node0.ken node0 master-node
10.158.57.151 node1.ken node1
10.158.57.152 node2.ken node2
10.158.57.153 node3.ken node3
10.158.57.154 node4.ken node4
All hosts can ping each other, SELinux and firewalld are disabled, and I can successfully telnet node1:2181 from all the other nodes. I've already tried the steps suggested here, but ZooKeeper still fails to resolve the hosts: https://jayunit100.blogspot.com/2013/05/debugging-hbase-installation.html
Am I missing something? Where else does Zookeeper pull its host resolution from?
UPDATE: 2019-03-29
The problem seems to be in the ZooKeeper client that HBase uses (zookeeper.version=3.4.10), and it might be related to this bug: https://issues.apache.org/jira/browse/ZOOKEEPER-2982 Does anyone know how to replace the ZooKeeper client HBase uses with a newer one?
UPDATE: 2019-04-01
I tried replacing hbase/lib/zookeeper-3.4.10.jar with hbase/lib/zookeeper-3.4.13.jar. It produces the same kind of error, just from a different API call:
2019-03-29 19:09:46,880 INFO [main] zookeeper.ZooKeeper: Initiating client connection, connectString=node0.ken:2181, node1.ken:2181, node2.ken:2181, node3.ken:2181, node4.ken:2181 sessionTimeout=1200000 watcher=org.apache.hadoop.hbase.zookeeper.PendingWatcher@336880df
2019-03-29 19:09:46,912 INFO [main-SendThread( node1.ken:2181)] zookeeper.ClientCnxn: Opening socket connection to server node1.ken:2181. Will not attempt to authenticate using SASL (unknown error)
2019-03-29 19:09:46,917 WARN [main-SendThread( node1.ken:2181)] zookeeper.ClientCnxn: Session 0x0 for server node1.ken:2181, unexpected error, closing socket connection and attempting reconnect
java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:101)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622)
at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1021)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1064)
I tried compiling a small Java class to test these functions:
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.net.Socket;
import sun.nio.ch.Net;
import java.net.UnknownHostException;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class GetIPHostname {
    public static void main(String[] args) {
        InetAddress ip;
        String hostname;
        try {
            ip = InetAddress.getLocalHost();
            hostname = ip.getHostName();
            System.out.println("Your current IP address : " + ip);
            System.out.println("Your current Hostname : " + hostname);
            //List<String> hostname_list = Arrays.asList("node0", "node1", "node2", "node3", "node4");
            List<String> hostname_list = Arrays.asList("node0.ken", "node1.ken", "node2.ken", "node3.ken", "node4.ken");
            for (String cur_hostname : hostname_list) {
                String ip_address = InetAddress.getByName(cur_hostname).getHostAddress();
                System.out.println("Hostname resolved: "+cur_hostname+" -> "+ip_address);
                final Socket socket = new Socket();
                SocketAddress address = new InetSocketAddress(cur_hostname, 2181);
                try {
                    InetSocketAddress isa = Net.checkAddress(address);
                    System.out.println("ISA: " +isa.getAddress()+ " -> " +isa.getPort());
                    InetAddress[] iadresses= InetAddress.getAllByName(cur_hostname);
                    for (InetAddress cur_ia : iadresses) {
                        System.out.println("InetAddress: " + cur_ia.getHostAddress());
                    }
                    socket.connect(address);
                    socket.close();
                } catch (IOException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                } // To connect to remote host
            }
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }
    }
}
... and both APIs are able to resolve the addresses using the hosts file and even connect to the Zookeeper servers at port 2181:
[root@node1 test]# java GetIPHostname
Your current IP address : node1.ken/10.158.57.151
Your current Hostname : node1.ken
Hostname resolved: node0.ken -> 10.158.57.150
ISA: node0.ken/10.158.57.150 -> 2181
InetAddress: 10.158.57.150
Hostname resolved: node1.ken -> 10.158.57.151
ISA: node1.ken/10.158.57.151 -> 2181
InetAddress: 10.158.57.151
Hostname resolved: node2.ken -> 10.158.57.152
ISA: node2.ken/10.158.57.152 -> 2181
InetAddress: 10.158.57.152
Hostname resolved: node3.ken -> 10.158.57.153
ISA: node3.ken/10.158.57.153 -> 2181
InetAddress: 10.158.57.153
Hostname resolved: node4.ken -> 10.158.57.154
ISA: node4.ken/10.158.57.154 -> 2181
InetAddress: 10.158.57.154
Upvotes: 2
Views: 1094
Reputation: 11
<property>
<name>hbase.zookeeper.quorum</name>
<value>node0.ken, node1.ken, node2.ken, node3.ken, node4.ken</value>
</property>
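Try removing the spaces after the commas in the quorum value. The connectString in your log still contains them, and the failing thread is named main-SendThread( node1.ken:2181), so the client is probably trying to resolve " node1.ken" with a leading space, which is not in /etc/hosts. Change the value to: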
<property>
<name>hbase.zookeeper.quorum</name>
<value>node0.ken,node1.ken,node2.ken,node3.ken,node4.ken</value>
</property>
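To see why the spacing could matter, here is a minimal sketch in plain Java. It is not HBase's or ZooKeeper's actual parsing code; the class name QuorumSpacingCheck and its two helper methods are made up for illustration. It only assumes, based on the connectString in the log, that the spaces reach the client unmodified. Splitting the quorum value on commas without trimming leaves a leading space on every host except the first, which would explain why node0.ken is fine and node1.ken is the first one to fail.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not part of HBase or ZooKeeper).
class QuorumSpacingCheck {

    // Naive split: keeps whatever whitespace follows each comma.
    static List<String> splitRaw(String quorum) {
        List<String> hosts = new ArrayList<>();
        for (String part : quorum.split(",")) {
            hosts.add(part);
        }
        return hosts;
    }

    // Split and trim each entry, skipping empty ones.
    static List<String> splitTrimmed(String quorum) {
        List<String> hosts = new ArrayList<>();
        for (String part : quorum.split(",")) {
            String host = part.trim();
            if (!host.isEmpty()) {
                hosts.add(host);
            }
        }
        return hosts;
    }

    public static void main(String[] args) {
        // Pass the quorum string as the first argument, or fall back to the one from the question.
        String quorum = args.length > 0
                ? args[0]
                : "node0.ken, node1.ken, node2.ken, node3.ken, node4.ken";

        // Brackets make a leading space visible in the output.
        for (String host : splitRaw(quorum)) {
            // InetSocketAddress tries to resolve the name and marks itself unresolved on failure.
            InetSocketAddress addr = new InetSocketAddress(host, 2181);
            System.out.println("raw     [" + host + "] unresolved=" + addr.isUnresolved());
        }
        for (String host : splitTrimmed(quorum)) {
            InetSocketAddress addr = new InetSocketAddress(host, 2181);
            System.out.println("trimmed [" + host + "] unresolved=" + addr.isUnresolved());
        }
    }
}
```

Running it on one of the nodes with the original quorum string should show whether the untrimmed names resolve there. This would also explain why the GetIPHostname test in the question succeeded: it used hostnames without any leading space.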
Upvotes: 1