Maher Shahmeer

Reputation: 148

Datanodes are active but I'm not able to copy files to HDFS [Hadoop 2.6.0 - Raspberry Pi Cluster]

I have been working on a Hadoop cluster built from Raspberry Pis, just for learning purposes. I have configured all the slaves and the master successfully (as far as I can tell).

Problem: HDFS is not able to copy local files. According to http://Master:8088 I have 3 active nodes. (I attached a screenshot at the end.)

But when I try to copy a local file to HDFS, I get the exception below:

16/01/12 06:20:43 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /LICENCE.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

        at org.apache.hadoop.ipc.Client.call(Client.java:1468)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
put: File /LICENCE.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.
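(For reference, the copy itself is nothing special; it is an ordinary put of the local licence file into the HDFS root, along the lines of the command below.)

    hdfs dfs -put LICENCE.txt /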

Below are my configurations on the slaves and the master:

core-site.xml:

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://Master:9000</value>
</property>
</configuration>

mapred-site.xml

<configuration>
<property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
</property>
<property>
        <name>mapreduce.job.tracker</name>
        <value>Master:5431</value>
</property>
</configuration>

hdfs-site.xml

<configuration>
 <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
 </property>
 <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
 </property>
 <property>
  <name>dfs.replication</name>
  <value>4</value>
 </property>
</configuration>

yarn-site.xml

<configuration>
<property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
</property>
<property>
      <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>Master:8025</value>
</property>
<property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>Master:8035</value>
</property>
<property>
        <name>yarn.resourcemanager.address</name>
        <value>Master:8050</value>
</property>
</configuration>

When I run jps on the master:

1218 ResourceManager
2147 Jps
1034 SecondaryNameNode
879 NameNode

When I run jps on the slaves:

1270 Jps
1118 NodeManager

Active Node Screenshot

I would really be thankful to you all; please help me through it. I searched on Stack Overflow and tried many things but couldn't fix it. I have already deleted the temporary directories and formatted the NameNode and DataNode.
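Concretely, the cleanup I already tried was roughly the following (the directories are the ones from my hdfs-site.xml above):

    # after stopping the cluster, on the master and on every slave:
    rm -rf /usr/local/hadoop_tmp/hdfs/namenode/*
    rm -rf /usr/local/hadoop_tmp/hdfs/datanode/*

    # then reformat the namenode before starting HDFS again
    hdfs namenode -format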

If you require anything else for debugging purposes, I'll be right here.

Thanks a lot!

Regards, Maher Shahmeer

Upvotes: 0

Views: 954

Answers (2)

Thanga

Reputation: 8101

Your hdfs-site.xml should be different on the master and the slaves.

  1. The name node setting has to be:

    <configuration>
     <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
     </property>
    <property>
      <name>dfs.replication</name>
      <value>4</value>
     </property>
    </configuration>
    
  2. All the slave nodes should have the setting below:

    <configuration>
     <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
     </property>
     <property>
      <name>dfs.replication</name>
      <value>4</value>
     </property>
    </configuration>
    
  3. And your slaves configuration file should include the hostnames or IPs of all of your data nodes, as in the sketch below.
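For example, with a standard Hadoop 2.x layout the file is etc/hadoop/slaves under your Hadoop install directory, with one data node per line (Slave1, Slave2 and Slave3 below are placeholders for your own hostnames or IPs):

    $ cat $HADOOP_HOME/etc/hadoop/slaves
    Slave1
    Slave2
    Slave3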

Finally, stop and restart your Hadoop cluster. The issue should be fixed; a quick way to verify is shown below. Best of luck.
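For example, assuming the standard sbin scripts and that hdfs is on your PATH, from the master:

    # restart HDFS and YARN
    $HADOOP_HOME/sbin/stop-yarn.sh; $HADOOP_HOME/sbin/stop-dfs.sh
    $HADOOP_HOME/sbin/start-dfs.sh; $HADOOP_HOME/sbin/start-yarn.sh

    # the report should now list your data nodes as live instead of 0
    hdfs dfsadmin -report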

Upvotes: 1

Russel Baisas

Reputation: 26

I can't seem to find a DataNode process running on your slaves or master.
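If the cluster has been restarted and the DataNodes still do not appear, a reasonable first step (assuming the default layout under $HADOOP_HOME) is to look at the datanode log on a slave and try starting the daemon by hand:

    # check why the DataNode did not come up
    tail -n 50 $HADOOP_HOME/logs/hadoop-*-datanode-*.log

    # try starting it manually, then confirm it shows up in jps
    $HADOOP_HOME/sbin/hadoop-daemon.sh start datanode
    jps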

Upvotes: 0
