Reputation: 21
I set up a two-node Hadoop cluster on Amazon EC2. It works well: I can upload data to HDFS from the master node, or from other instances in the same Amazon zone as the cluster, using the Hadoop API (the Java program is attached below).
However, when I try to do the same from my local, non-Hadoop machine, it fails with the exceptions below.
I then logged in to the Hadoop namenode to check from the command line. The folder "testdir" is created, but the size of the uploaded file "myfile" is 0.
==================this is separator===============================
These are the exceptions
Apr 18, 2013 10:40:47 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream createBlockOutputStream
INFO: Exception in createBlockOutputStream 10.196.153.215:50010 java.net.ConnectException: Connection timed out
Apr 18, 2013 10:40:47 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Abandoning block blk_560654195674249927_1002
Apr 18, 2013 10:40:47 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Excluding datanode 10.196.153.215:50010
Apr 18, 2013 10:41:09 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream createBlockOutputStream
INFO: Exception in createBlockOutputStream 10.195.171.154:50010 java.net.ConnectException: Connection timed out
Apr 18, 2013 10:41:09 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Abandoning block blk_1747509888999401559_1002
Apr 18, 2013 10:41:10 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream nextBlockOutputStream
INFO: Excluding datanode 10.195.171.154:50010
Apr 18, 2013 10:41:10 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer run
WARNING: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/ubuntu/testdir/myfile could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
at org.apache.hadoop.ipc.Client.call(Client.java:1070)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy1.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy1.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3510)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3373)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2589)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2829)
Apr 18, 2013 10:41:10 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream processDatanodeError
WARNING: Error Recovery for block blk_1747509888999401559_1002 bad datanode[0] nodes == null
Apr 18, 2013 10:41:10 AM org.apache.hadoop.hdfs.DFSClient$DFSOutputStream processDatanodeError
WARNING: Could not get block locations. Source file "/user/ubuntu/testdir/myfile" - Aborting...
Exception in thread "main" org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/ubuntu/testdir/myfile could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
at org.apache.hadoop.ipc.Client.call(Client.java:1070)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy1.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy1.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3510)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3373)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2589)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2829)
==================this is separator===============================
Here is my Java code:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

Path output = new Path("testdir");
Configuration conf = new Configuration();
// Point the client at the EC2 namenode.
conf.set("fs.default.name", "hdfs://ec2-23-22-12-173.compute-1.amazonaws.com:9000");
conf.set("hadoop.job.user", "ubuntu"); // string literal, not a bare identifier
FileSystem.mkdirs(FileSystem.get(conf), output, FsPermission.valueOf("drwxr-xr-x"));
FileSystem fs = FileSystem.get(conf);
fs.copyFromLocalFile(new Path("./myfile"), output);
==================this is separator=============================== PS. I have already opened ports 9000 and 50010 in the security group and turned off the Linux firewall.
Does anyone have any thoughts?
Thanks.
Upvotes: 2
Views: 2290
Reputation: 1
Try using an HDFS client from a Python script, or use HttpFS. You will need to change the configuration in hdfs-site.xml or core-site.xml; see the sketch below for the HttpFS/WebHDFS route.
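Below is a minimal sketch of that HttpFS/WebHDFS route, written in Java to match the question's code rather than Python; the gateway address, port 14000 (the HttpFS default), the user name and the target path are assumptions, not values taken from the question. The appeal of HttpFS here is that the gateway proxies the data transfer, so the client only has to reach that one port rather than every datanode; for plain WebHDFS the usual hdfs-site.xml change is setting dfs.webhdfs.enabled to true.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

public class HttpFsUpload {
    public static void main(String[] args) throws Exception {
        // Assumed HttpFS gateway; with plain WebHDFS you would use the namenode's HTTP port instead.
        String base = "http://ec2-23-22-12-173.compute-1.amazonaws.com:14000/webhdfs/v1";
        String file = "/user/ubuntu/testdir/myfile";

        // Step 1: ask where to write; the server answers with a 307 redirect.
        URL createUrl = new URL(base + file + "?op=CREATE&user.name=ubuntu&overwrite=true");
        HttpURLConnection c1 = (HttpURLConnection) createUrl.openConnection();
        c1.setRequestMethod("PUT");
        c1.setInstanceFollowRedirects(false);
        System.out.println("Step 1: HTTP " + c1.getResponseCode()); // expect 307
        String location = c1.getHeaderField("Location");

        // Step 2: PUT the file bytes to the redirect location.
        HttpURLConnection c2 = (HttpURLConnection) new URL(location).openConnection();
        c2.setRequestMethod("PUT");
        c2.setDoOutput(true);
        c2.setRequestProperty("Content-Type", "application/octet-stream");
        try (OutputStream out = c2.getOutputStream()) {
            out.write(Files.readAllBytes(Paths.get("myfile")));
        }
        System.out.println("Step 2: HTTP " + c2.getResponseCode()); // expect 201 Created
    }
}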
Upvotes: 0
Reputation: 162
Did you find any answer to this problem? If not, here is the potential REASON: your client is trying to reach the datanodes on their private EC2 IP addresses (which are only visible inside the cluster), not their public IPs. You can verify that by looking at your error log: the excluded datanode is a private IP, not a public one. I have no clue how we should overcome that; I have the same issue. For more information check this link: http://www.hadoopinrealworld.com/could-only-be-replicated-to-0-nodes/
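For what it is worth, one client-side setting that often gets suggested for this private-IP situation (relayed here as an assumption, not something I have verified, and it only exists in newer Hadoop releases, roughly 1.1+/2.x) is to make the DFS client dial datanodes by hostname instead of the IP the namenode hands back, with the datanodes advertising publicly resolvable hostnames:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Hedged sketch: connect to datanodes by hostname rather than the private IP
// returned by the namenode. Needs dfs.client.use.datanode.hostname support in
// your Hadoop release and EC2 public DNS names on the datanodes
// (server side: dfs.datanode.use.datanode.hostname).
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://ec2-23-22-12-173.compute-1.amazonaws.com:9000");
conf.setBoolean("dfs.client.use.datanode.hostname", true);
FileSystem fs = FileSystem.get(conf);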
Upvotes: 1
Reputation: 34184
There could be several reasons behind this error (a quick programmatic check is sketched after this list):
1- The DataNodes are not up and running. Make sure that is not the case; if nothing stands out, dig into the DN logs on each server.
2- The free space on the machines where the DNs are running is less than the amount you reserved through the "dfs.datanode.du.reserved" property.
3- There is actually no space left on your DN machines.
4- The path specified by "dfs.data.dir" in your hdfs-site.xml file has no space left (perhaps the disk serving as dfs.data.dir has run out of space).
5- The DNs are not able to send heartbeats/block reports to the NN. Make sure there is no network-related issue.
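Here is a minimal sketch for checking points 1-4 programmatically, assuming the same namenode address as in the question and the Hadoop 1.x API that the question's stack trace shows; it asks the NameNode for its live datanode list and each node's remaining space, roughly what "hadoop dfsadmin -report" prints:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class DatanodeCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://ec2-23-22-12-173.compute-1.amazonaws.com:9000");

        // The cast is safe as long as fs.default.name points at an HDFS namenode.
        DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

        // Datanodes the NameNode currently knows about, with their free space.
        // An empty list, or 0 MB remaining, matches reasons 1-4 above.
        for (DatanodeInfo dn : dfs.getDataNodeStats()) {
            System.out.println(dn.getName() + ": remaining = " + (dn.getRemaining() / (1024 * 1024)) + " MB");
        }
    }
}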
HTH
Upvotes: 1