Reputation: 423
I have Hadoop 2.7.3 and HBase 1.2.3, and I am trying to run HBase in pseudo-distributed mode on a single machine, following the official documentation. HDFS itself is working fine.
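(For context, this is roughly how I convinced myself that HDFS is healthy; the test path is just an example.)

    # quick sanity check that the HDFS daemons are up and writable
    jps                                  # NameNode, DataNode, SecondaryNameNode are listed
    hdfs dfsadmin -safemode get          # prints "Safe mode is OFF"
    hdfs dfs -mkdir -p /tmp/smoketest
    hdfs dfs -put /etc/hosts /tmp/smoketest/
    hdfs dfs -cat /tmp/smoketest/hosts   # the file round-trips fine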
The problem is that when I execute start-hbase.sh, the HRegionServer starts up but then exits on its own, while the HMaster and HQuorumPeer processes keep running.
From the log, it looks like HBase created the file /hbase/WALs/ubuntuserver,16201,1478832152563/ubuntuserver%2C16201%2C1478832152563..meta.1478832162907.meta but then had no permission to append content to it. The current user is 'ubuntuserver' in group 'root'. I changed the ownership of all folders on HDFS to 'ubuntuserver:root', ran 'hdfs dfs -chmod -R 777 /', and then restarted Linux, HDFS and HBase, but it made no difference: every time the HRegionServer starts, it creates a new WAL file and then again seems to have no permission to append to it.
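The exact commands I used were roughly these (ubuntuserver:root is just my user and group; the wide-open chmod was only a test):

    # give my user ownership of everything on HDFS, including /hbase
    hdfs dfs -chown -R ubuntuserver:root /
    # blunt test: open up all permissions
    hdfs dfs -chmod -R 777 /
    # then restart everything
    stop-hbase.sh && stop-dfs.sh
    start-dfs.sh && start-hbase.sh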
Here is the HRegionServer log. How can I fix this?
2016-11-11 11:13:44,774 INFO [RS_OPEN_META-ubuntuServer:16201-0-MetaLogRoller] regionserver.HRegionServer: STOPPED: Failed log close in log roller
2016-11-11 11:13:44,774 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.SplitLogWorker: Sending interrupt to stop the worker thread
2016-11-11 11:13:44,775 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.HRegionServer: Stopping infoServer
2016-11-11 11:13:44,776 INFO [SplitLogWorker-ubuntuServer:16201] regionserver.SplitLogWorker: SplitLogWorker interrupted. Exiting.
2016-11-11 11:13:44,776 INFO [SplitLogWorker-ubuntuServer:16201] regionserver.SplitLogWorker: SplitLogWorker ubuntuserver,16201,1478834015515 exiting
2016-11-11 11:13:44,780 INFO [RS_OPEN_META-ubuntuServer:16201-0-MetaLogRoller] regionserver.LogRoller: LogRoller exiting.
2016-11-11 11:13:44,805 INFO [regionserver/ubuntuServer/10.0.2.15:16201] mortbay.log: Stopped [email protected]:16301
2016-11-11 11:13:44,810 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.HeapMemoryManager: Stoping HeapMemoryTuner chore.
2016-11-11 11:13:44,810 INFO [regionserver/ubuntuServer/10.0.2.15:16201] flush.RegionServerFlushTableProcedureManager: Stopping region server flush procedure manager abruptly.
2016-11-11 11:13:44,810 INFO [regionserver/ubuntuServer/10.0.2.15:16201] snapshot.RegionServerSnapshotManager: Stopping RegionServerSnapshotManager abruptly.
2016-11-11 11:13:44,810 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.HRegionServer: aborting server ubuntuserver,16201,1478834015515
2016-11-11 11:13:44,811 INFO [regionserver/ubuntuServer/10.0.2.15:16201] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x158516036c10005
2016-11-11 11:13:44,813 INFO [regionserver/ubuntuServer/10.0.2.15:16201-EventThread] zookeeper.ClientCnxn: EventThread shut down
2016-11-11 11:13:44,814 INFO [regionserver/ubuntuServer/10.0.2.15:16201] zookeeper.ZooKeeper: Session: 0x158516036c10005 closed
2016-11-11 11:13:44,814 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.HRegionServer: stopping server ubuntuserver,16201,1478834015515; all regions closed.
2016-11-11 11:13:44,810 INFO [MemStoreFlusher.0] regionserver.MemStoreFlusher: MemStoreFlusher.0 exiting
2016-11-11 11:13:44,820 INFO [MemStoreFlusher.1] regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
2016-11-11 11:13:44,814 WARN [regionserver/ubuntuServer/10.0.2.15:16201] wal.ProtobufLogWriter: Failed to write trailer, non-fatal, continuing...
java.nio.channels.ClosedChannelException
    at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1538)
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:98)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
    at java.io.DataOutputStream.write(DataOutputStream.java:107)
    at com.google.protobuf.CodedOutputStream.refreshBuffer(CodedOutputStream.java:833)
    at com.google.protobuf.CodedOutputStream.flush(CodedOutputStream.java:843)
    at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:80)
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.writeWALTrailer(ProtobufLogWriter.java:157)
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.close(ProtobufLogWriter.java:130)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.shutdown(FSHLog.java:1079)
    at org.apache.hadoop.hbase.wal.DefaultWALProvider.shutdown(DefaultWALProvider.java:114)
    at org.apache.hadoop.hbase.wal.WALFactory.shutdown(WALFactory.java:216)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.shutdownWAL(HRegionServer.java:1315)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1064)
    at java.lang.Thread.run(Thread.java:745)
2016-11-11 11:13:44,829 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.Leases: regionserver/ubuntuServer/10.0.2.15:16201 closing leases
2016-11-11 11:13:44,829 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.Leases: regionserver/ubuntuServer/10.0.2.15:16201 closed leases
2016-11-11 11:13:44,830 INFO [regionserver/ubuntuServer/10.0.2.15:16201] hbase.ChoreService: Chore service for: ubuntuserver,16201,1478834015515 had [[ScheduledChore: Name: ubuntuserver,16201,1478834015515-MemstoreFlusherChore Period: 10000 Unit: MILLISECONDS], [ScheduledChore: Name: MovedRegionsCleaner for region ubuntuserver,16201,1478834015515 Period: 120000 Unit: MILLISECONDS]] on shutdown
2016-11-11 11:13:48,193 INFO [HBase-Metrics2-1] impl.MetricsSystemImpl: Stopping HBase metrics system...
2016-11-11 11:13:48,194 INFO [HBase-Metrics2-1] impl.MetricsSystemImpl: HBase metrics system stopped.
2016-11-11 11:13:48,695 INFO [HBase-Metrics2-1] impl.MetricsConfig: loaded properties from hadoop-metrics2-hbase.properties
2016-11-11 11:13:48,708 INFO [HBase-Metrics2-1] impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-11-11 11:13:48,708 INFO [HBase-Metrics2-1] impl.MetricsSystemImpl: HBase metrics system started
2016-11-11 11:13:50,001 INFO [regionserver/ubuntuServer/10.0.2.15:16201.logRoller] regionserver.LogRoller: LogRoller exiting.
2016-11-11 11:13:50,002 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.CompactSplitThread: Waiting for Split Thread to finish...
2016-11-11 11:13:50,002 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.CompactSplitThread: Waiting for Merge Thread to finish...
2016-11-11 11:13:50,002 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.CompactSplitThread: Waiting for Large Compaction Thread to finish...
2016-11-11 11:13:50,002 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.CompactSplitThread: Waiting for Small Compaction Thread to finish...
2016-11-11 11:13:50,012 INFO [regionserver/ubuntuServer/10.0.2.15:16201] ipc.RpcServer: Stopping server on 16201
2016-11-11 11:13:50,012 INFO [RpcServer.listener,port=16201] ipc.RpcServer: RpcServer.listener,port=16201: stopping
2016-11-11 11:13:50,017 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2016-11-11 11:13:50,017 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2016-11-11 11:13:50,016 INFO [regionserver/ubuntuServer/10.0.2.15:16201.leaseChecker] regionserver.Leases: regionserver/ubuntuServer/10.0.2.15:16201.leaseChecker closing leases
2016-11-11 11:13:50,026 INFO [regionserver/ubuntuServer/10.0.2.15:16201.leaseChecker] regionserver.Leases: regionserver/ubuntuServer/10.0.2.15:16201.leaseChecker closed leases
2016-11-11 11:13:50,028 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2016-11-11 11:13:50,029 INFO [regionserver/ubuntuServer/10.0.2.15:16201] zookeeper.ZooKeeper: Session: 0x158516036c10004 closed
2016-11-11 11:13:50,029 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.HRegionServer: stopping server ubuntuserver,16201,1478834015515; zookeeper connection closed.
2016-11-11 11:13:50,029 INFO [regionserver/ubuntuServer/10.0.2.15:16201] regionserver.HRegionServer: regionserver/ubuntuServer/10.0.2.15:16201 exiting
2016-11-11 11:13:50,029 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting
java.lang.RuntimeException: HRegionServer Aborted
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:68)
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2665)
2016-11-11 11:13:50,031 INFO [Thread-6] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@18230356
2016-11-11 11:13:50,033 INFO [Thread-6] regionserver.ShutdownHook: Starting fs shutdown hook thread.
2016-11-11 11:13:50,036 ERROR [Thread-82] hdfs.DFSClient: Failed to close inode 16780
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/WALs/ubuntuserver,16201,1478834015515/ubuntuserver%2C16201%2C1478834015515..meta.1478834024410.meta could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1571)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3107)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
    at org.apache.hadoop.ipc.Client.call(Client.java:1411)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy16.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy16.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
    at com.sun.proxy.$Proxy17.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
2016-11-11 11:13:50,057 INFO [Thread-6] regionserver.ShutdownHook: Shutdown hook finished.
Upvotes: 0
Views: 2165
Reputation: 423
Once again I have to answer my own question.
The root cause is not a permission issue; it is a storage space issue. I deployed the cluster in a virtual machine with an 8 GB hard disk, and about 7.8 GB of it was already in use. With essentially no free space left on the single DataNode, HDFS could not allocate a block for the new WAL file, which is what the "could only be replicated to 0 nodes instead of minReplication (=1)" error in the log is really saying.
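If anyone else hits this, a quick way to tell a space problem apart from a permission problem is to compare the free space on the DataNode's local disk with what HDFS reports; something along these lines (standard Linux/Hadoop commands) makes it obvious:

    # local disk usage on the (single) DataNode host
    df -h /
    # per-datanode capacity as seen by HDFS; look at "DFS Remaining"
    hdfs dfsadmin -report
    # optional: how much space the HBase WAL directories are taking
    hdfs dfs -du -h /hbase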
Solution: I recreated the virtual machine with a 200 GB hard disk.
Upvotes: 3