user07

Reputation: 670

Region server does not start on a Hadoop 2.0 distributed cluster

While starting the HBase cluster, I got the error below:

  2015-05-15 16:58:31,741 WARN  [regionserver60020-SendThread(hbasenamenode:2181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
  java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
  2015-05-15 16:58:32,843 INFO  [regionserver60020-SendThread(hbasenamenode:2181)] zookeeper.ClientCnxn: Opening socket connection to server hbasenamenode/172.17.198.59:2181. Will not attempt to authenticate using SASL (unknown error)
  2015-05-15 16:58:32,847 WARN  [regionserver60020-SendThread(hbasenamenode:2181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
  java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
  2015-05-15 16:58:33,752 INFO  [regionserver60020] ipc.RpcServer: Stopping server on 60020
  2015-05-15 16:58:33,755 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server demodatanode2clone2,60020,1431689290504: Initialization of RS failed.  Hence aborting RS.
  java.io.IOException: Received the shutdown message while waiting.
    at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:783)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:730)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:702)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:837)
    at java.lang.Thread.run(Thread.java:744)
  2015-05-15 16:58:33,756 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
  2015-05-15 16:58:33,767 INFO  [regionserver60020] ipc.RpcServer: Stopping server on 60020
  2015-05-15 16:58:33,767 INFO  [regionserver60020] regionserver.HRegionServer: Stopping infoServer
  2015-05-15 16:58:33,845 INFO  [regionserver60020] mortbay.log: Stopped [email protected]:60030
  2015-05-15 16:58:33,949 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting
  java.lang.RuntimeException: HRegionServer Aborted
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66)
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2410)
  2015-05-15 16:58:33,951 INFO  [regionserver60020-SendThread(hbasenamenode:2181)] zookeeper.ClientCnxn: Opening socket connection to server hbasenamenode/172.17.198.59:2181. Will not attempt to authenticate using SASL (unknown error)
  2015-05-15 16:58:33,953 WARN  [regionserver60020-SendThread(hbasenamenode:2181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
  java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
  2015-05-15 16:58:33,959 INFO  [Thread-9] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@36d87f9e
  2015-05-15 16:58:33,972 INFO  [Thread-9] regionserver.ShutdownHook: Starting fs shutdown hook thread.
  2015-05-15 16:58:33,983 INFO  [Thread-9] regionserver.ShutdownHook: Shutdown hook finished.

While searching for the error, I found a suggestion on a site to run this command:

  bin/hbase zkcli

It worked, but I don't understand what this command does. Can anyone explain why I was hitting this error and how the command resolved it?

Upvotes: 0

Views: 3755

Answers (1)

Rajesh N

Reputation: 2574

Add this property in hbase-site.xml:

<property>
    <name>hbase.zookeeper.property.maxClientCnxns</name>
    <value>1000</value>
</property>

This property raises the maximum number of concurrent connections that a single client host may open to one ZooKeeper server.

The default value is 300. Change it to 1000 to avoid ZooKeeper ConnectionLoss errors. Also add the hbase.zookeeper.quorum and hbase.zookeeper.property.clientPort properties to the slave node's hbase-site.xml.

NOTE: Add this property on both the master and slave nodes, then restart HBase.
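The "Connection refused" lines in the log mean nothing accepted the TCP connection on the ZooKeeper client port, so it can help to confirm that each quorum host is reachable on that port from the region server machine. The following is a minimal sketch of such a check, not part of HBase; the function name `can_connect` is hypothetical, and it only tests that the port accepts a connection, not that ZooKeeper is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the hostname and opens the socket;
        # the with-block closes it again as soon as the check is done.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

For the setup in the question, `can_connect("hbasenamenode", 2181)` run on the region server host would return False as long as the condition behind the log's "Connection refused" persists.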

UPDATE:

Change your hbase-site.xml (on both the master and slave nodes) as follows:

<configuration>
    <property>
        <name>hbase.master</name>
        <value>master:60000</value>
    </property>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://NN:PortNo/hbase</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>NN,DN</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.maxClientCnxns</name>
        <value>1000</value>
    </property>
</configuration>

I use the same hbase-site.xml on all nodes, but your master and slave nodes have different files. This may cause problems later. Try to keep hbase-site.xml identical on every node.
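One way to spot differences between two nodes' files is to load each into a name-to-value map and compare the maps. The sketch below assumes the usual `<configuration>/<property>/<name>/<value>` layout shown above; the helper names `load_props` and `diff_props` and the sample values are made up for illustration.

```python
import xml.etree.ElementTree as ET


def load_props(xml_text: str) -> dict:
    """Map each <name> to its <value> in a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def diff_props(a: dict, b: dict) -> dict:
    """Return {name: (value_in_a, value_in_b)} for every property that differs."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Illustrative inputs only; in practice read each node's hbase-site.xml from disk.
master = """<configuration>
  <property><name>hbase.zookeeper.quorum</name><value>NN,DN</value></property>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
</configuration>"""
slave = """<configuration>
  <property><name>hbase.zookeeper.quorum</name><value>localhost</value></property>
</configuration>"""

for name, (m, s) in diff_props(load_props(master), load_props(slave)).items():
    print(f"{name}: master={m!r} slave={s!r}")
```

A property missing from one file shows up with `None` on that side, which makes omissions as visible as conflicting values.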

UPDATE II:

  1. Delete one entry for demonamenodeclone2 in the master's regionservers file. The regionservers file on the master should contain only two lines: one for the master hostname and one for the slave hostname.

  2. The regionservers file on the slave node should match the one on the master, but yours contains only localhost. Change it to contain the same two lines as the master's regionservers file.

  3. You are missing a </property> in hbase-site.xml for hbase.zookeeper.property.clientPort. Fix it on the slave node too.
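A missing closing tag like the one in item 3 can be caught before restarting HBase by running the file through any XML parser. Here is a small sketch using Python's standard library; the function name `xml_error` and the sample snippets are illustrative, and the exact wording of the parser message may vary.

```python
import xml.etree.ElementTree as ET


def xml_error(xml_text: str):
    """Return None if the text is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(xml_text)
        return None
    except ET.ParseError as exc:
        return str(exc)


# The clientPort property below is never closed, as described in item 3.
broken = """<configuration>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>"""

print(xml_error(broken) or "well-formed")
```

The parser reports a line and column, which is usually close to where the unclosed element ends up colliding with a later closing tag.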

Upvotes: 1
