Kaliyug Antagonist

Reputation: 3612

HBase distributed mode

I'm trying to run HBase (0.94.11) in distributed mode on a 3-node Hadoop (1.0.4) cluster, but I want to use only two of the nodes for HBase.

Master/Namenode : cldx-1230-1116( IP :
Regionserver/Slave : cldx-1229-1117(IP :

HBase starts, but no regionserver shows up on the master. The logs show the following:

Master/namenode log :

2013-09-03 14:52:23,683 DEBUG org.apache.hadoop.hbase.master.HMaster: Started service threads
2013-09-03 14:52:23,684 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 0 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2013-09-03 14:52:24,587 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString= sessionTimeout=180000 watcher=hconnection
2013-09-03 14:52:24,607 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 31222@cldx-1230-1116
2013-09-03 14:52:24,610 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server slave/ Will not attempt to authenticate using SASL (unknown error)
2013-09-03 14:52:24,615 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to slave/, initiating session
2013-09-03 14:52:24,631 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server slave/, sessionid = 0x140e363f8090002, negotiated timeout = 180000
2013-09-03 14:52:25,230 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 1546 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2013-09-03 14:52:26,753 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 3068 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2013-09-03 14:52:28,266 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 4582 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.

regionserver/slave log :

2013-09-03 16:05:18,307 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString= sessionTimeout=180000 watcher=regionserver:60020
2013-09-03 16:05:18,333 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/ Will not attempt to authenticate using SASL (unknown error)
2013-09-03 16:05:18,336 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 14384@cldx-1229-1117
2013-09-03 16:05:18,348 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to localhost/, initiating session
2013-09-03 16:05:18,426 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server localhost/, sessionid = 0x140e363f8090000, negotiated timeout = 180000
2013-09-03 16:05:18,452 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: Starting catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@3a9cfedf
2013-09-03 16:05:18,517 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node /hbase/online-snapshot/acquired already exists and this is not a retry
2013-09-03 16:05:18,557 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: globalMemStoreLimit=393.4m, globalMemStoreLimitLowMark=344.2m, maxHeap=983.4m
2013-09-03 16:05:18,561 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Runs every 2hrs, 46mins, 40sec
2013-09-03 16:05:18,621 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at localhost,60000,1378199761324
2013-09-03 16:05:28,697 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to master. Retrying. Error was:
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:390)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:436)
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1127)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
    at com.sun.proxy.$Proxy8.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:2030)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2076)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:744)
    at java.lang.Thread.run(Thread.java:722)
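
The line to notice in the regionserver log above is `Attempting connect to Master server at localhost,60000,1378199761324`. HBase prints server names as `host,port,startcode`, so the host part shows which address the regionserver was given for the master. Below is a minimal Python sketch for pulling that triple out of a log line; the function name and the set of loopback names are illustrative assumptions, not part of HBase.

```python
import re

# Names that normally resolve to the local machine (illustrative, not exhaustive).
LOOPBACK_NAMES = {"localhost", "ip6-localhost", "ip6-loopback"}

# Matches the "host,port,startcode" triple that follows the log message.
MASTER_RE = re.compile(
    r"Attempting connect to Master server at ([^,\s]+),(\d+),(\d+)"
)


def master_from_log(line):
    """Return (host, port, startcode) from a regionserver log line, or None."""
    m = MASTER_RE.search(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), int(m.group(3))


line = ("2013-09-03 16:05:18,621 INFO org.apache.hadoop.hbase.regionserver."
        "HRegionServer: Attempting connect to Master server at "
        "localhost,60000,1378199761324")
host, port, startcode = master_from_log(line)
print(host, port, host in LOOPBACK_NAMES)  # → localhost 60000 True
```

If the host printed here is a loopback name, the regionserver is dialing its own machine on port 60000, which would explain the `Connection refused`.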

slave's zookeeper log :

2013-09-03 16:05:18,345 INFO org.apache.zookeeper.server.NIOServerCnxnFactory: Accepted socket connection from /
2013-09-03 16:05:18,392 INFO org.apache.zookeeper.server.ZooKeeperServer: Client attempting to establish new session at /
2013-09-03 16:05:18,395 INFO org.apache.zookeeper.server.persistence.FileTxnLog: Creating new log file: log.5a
2013-09-03 16:05:18,422 INFO org.apache.zookeeper.server.ZooKeeperServer: Established session 0x140e363f8090000 with negotiated timeout 180000 for client /
2013-09-03 16:05:18,508 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x140e363f8090000 type:create cxid:0x8 zxid:0x5b txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
2013-09-03 16:05:33,933 INFO org.apache.zookeeper.server.NIOServerCnxnFactory: Accepted socket connection from /
2013-09-03 16:05:33,972 INFO org.apache.zookeeper.server.ZooKeeperServer: Client attempting to establish new session at /
2013-09-03 16:05:33,975 INFO org.apache.zookeeper.server.ZooKeeperServer: Established session 0x140e363f8090001 with negotiated timeout 180000 for client /
2013-09-03 16:05:42,358 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x140e363f8090001 type:create cxid:0xb zxid:0x5d txntype:-1 reqpath:n/a Error Path:/hbase/master Error:KeeperErrorCode = NodeExists for /hbase/master
2013-09-03 16:05:47,934 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x140e363f8090001 type:create cxid:0x1f zxid:0x63 txntype:-1 reqpath:n/a Error Path:/hbase/online-snapshot/acquired Error:KeeperErrorCode = NodeExists for /hbase/online-snapshot/acquired
2013-09-03 16:05:49,037 INFO org.apache.zookeeper.server.NIOServerCnxnFactory: Accepted socket connection from /
2013-09-03 16:05:49,042 INFO org.apache.zookeeper.server.ZooKeeperServer: Client attempting to establish new session at /
2013-09-03 16:05:49,050 INFO org.apache.zookeeper.server.ZooKeeperServer: Established session 0x140e363f8090002 with negotiated timeout 180000 for client /
2013-09-03 16:08:15,001 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x140e35e60460000, timeout of 180000ms exceeded
2013-09-03 16:08:15,001 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x140d02920860000, timeout of 180000ms exceeded
2013-09-03 16:08:15,002 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x140e35e60460001, timeout of 180000ms exceeded
2013-09-03 16:08:15,002 INFO org.apache.zookeeper.server.PrepRequestProcessor: Processed session termination for sessionid: 0x140e35e60460000
2013-09-03 16:08:15,002 INFO org.apache.zookeeper.server.PrepRequestProcessor: Processed session termination for sessionid: 0x140d02920860000
2013-09-03 16:08:15,002 INFO org.apache.zookeeper.server.PrepRequestProcessor: Processed session termination for sessionid: 0x140e35e60460001

regionservers file has only one entry viz.


  <description>The directory shared by RegionServers.</description>

  <description>The mode the cluster will be in. Possible values are
      false: standalone and pseudo-distributed setups with managed Zookeeper
      true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)




  1. The Hadoop masters file on the namenode( has
  2. The Hadoop slaves file on the namenode(, and
  3. The Hadoop masters file on the slave( has
  4. The Hadoop slaves file on the slave( has

hosts file on master :

#      localhost
#   localhost   cldx-1230-1116    cloudx master   slave
# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

hosts file on slave :

#      localhost
#     localhost     cldx-1229-1117 cldx-1229-1117     cldx-1230-1116 cldx-1230-1116    cloudx master   slave
# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

I'm clueless as to why the regionserver/slave is trying to connect to the master on localhost rather than at the master's own address (cldx-1230-1116)!

Upvotes: 0

Views: 5337

Answers (1)


Reputation: 34184

Add the IP and hostname of the HMaster to the /etc/hosts file of the RS and restart the HBase daemons. One possible reason is that your HMaster is assuming that the RS has a loopback IP (which implies localhost), so the name resolves to each machine's own localhost.

And yes, JD is absolutely correct. hbase.master is an obsolete property now.
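
To check the /etc/hosts suggestion, it can help to list which names in a hosts file are bound to a loopback address. Here is a rough Python sketch; the sample hosts content and the address 192.0.2.10 are made up for illustration (the real addresses are not shown in the question), and `loopback_names` is a hypothetical helper.

```python
import ipaddress


def loopback_names(hosts_text):
    """Return the set of names that a hosts file maps to a loopback address."""
    names = set()
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address; skip the line
        if addr.is_loopback:
            names.update(fields[1:])
    return names


# Hypothetical hosts content: the master's hostname is wrongly tied to 127.0.1.1.
SAMPLE = """\
127.0.0.1   localhost
127.0.1.1   cldx-1230-1116 master
192.0.2.10  cldx-1229-1117 slave
::1         localhost ip6-localhost ip6-loopback
"""

bad = loopback_names(SAMPLE) - {"localhost", "ip6-localhost", "ip6-loopback"}
print(sorted(bad))  # → ['cldx-1230-1116', 'master']
```

Any name that another node uses to reach this machine should not appear in that output; if it does, map it to the machine's real network address instead.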

Upvotes: 1
