Reputation: 71
I’m having a bit of trouble with a simple Hadoop install. I’ve downloaded hadoop 2.4.0 and installed on a single CentOS Linux node (Virtual Machine). I’ve configured hadoop for a single node with pseudo distribution as described on the apache site (http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-common/SingleCluster.html). It starts with no issues in the logs and I can read + write files using the “hadoop fs” commands from the command line.
I’m attempting to read a file from the HDFS on a remote machine with the Java API. The machine can connect and list directory contents. It can also determine if a file exists with the code:
Path p=new Path("hdfs://test.server:9000/usr/test/test_file.txt");
FileSystem fs = FileSystem.get(new Configuration());
System.out.println(p.getName() + " exists: " + fs.exists(p));
The system prints “true” indicating it exists. However, when I attempt to read the file with:
BufferedReader br = null;
try {
    Path p = new Path("hdfs://test.server:9000/usr/test/test_file.txt");
    FileSystem fs = FileSystem.get(CONFIG);
    System.out.println(p.getName() + " exists: " + fs.exists(p));
    br = new BufferedReader(new InputStreamReader(fs.open(p)));
    String line = br.readLine();
    while (line != null) {
        System.out.println(line);
        line = br.readLine();
    }
}
finally {
    if (br != null) br.close();
}
this code throws the exception:
Exception in thread "main" org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-13917963-127.0.0.1-1398476189167:blk_1073741831_1007 file=/usr/test/test_file.txt
Googling gave some possible tips but all checked out. The data node is connected, active, and has enough space. The admin report from hdfs dfsadmin -report shows:
Configured Capacity: 52844687360 (49.22 GB)
Present Capacity: 48507940864 (45.18 GB)
DFS Remaining: 48507887616 (45.18 GB)
DFS Used: 53248 (52 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Datanodes available: 1 (1 total, 0 dead)
Live datanodes:
Name: 127.0.0.1:50010 (test.server)
Hostname: test.server
Decommission Status : Normal
Configured Capacity: 52844687360 (49.22 GB)
DFS Used: 53248 (52 KB)
Non DFS Used: 4336746496 (4.04 GB)
DFS Remaining: 48507887616 (45.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 91.79%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Fri Apr 25 22:16:56 PDT 2014
The client jars were copied directly from the hadoop install so no version mismatch there. I can browse the file system with my Java class and read file attributes. I just can’t read the file contents without getting the exception. If I try to write a file with the code:
FileSystem fs = null;
BufferedWriter br = null;

System.setProperty("HADOOP_USER_NAME", "root");

try {
    fs = FileSystem.get(new Configuration());

    //Path p = new Path(dir, file);
    Path p = new Path("hdfs://test.server:9000/usr/test/test.txt");
    br = new BufferedWriter(new OutputStreamWriter(fs.create(p, true)));
    br.write("Hello World");
}
finally {
    if (br != null) br.close();
    if (fs != null) fs.close();
}
this creates the file but doesn’t write any bytes and throws the exception:
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /usr/test/test.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
Googling for this indicated a possible space issue but from the dfsadmin report, it seems there is plenty of space. This is a plain vanilla install and I can’t get past this issue.
The environment summary is:
SERVER:
Hadoop 2.4.0 with pseudo-distribution (http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-common/SingleCluster.html)
CentOS 6.5 Virtual Machine 64 bit server
Java 1.7.0_55
CLIENT:
Windows 8 (Virtual Machine)
Java 1.7.0_51
Any help is greatly appreciated.
Upvotes: 6
Views: 14667
Reputation: 94
We need to make sure the Configuration has the fs.default.name property set, such as:
configuration.set("fs.default.name","hdfs://ourHDFSNameNode:50000");
Below I've put a piece of sample code:
Configuration configuration = new Configuration();
// fs.default.name is the deprecated Hadoop 1.x key; in Hadoop 2.x the key is
// fs.defaultFS, but both are still honored.
configuration.set("fs.default.name", "hdfs://ourHDFSNameNode:50000");

// pt is the HDFS Path of the file to read (defined elsewhere).
FileSystem fs = pt.getFileSystem(configuration);
BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(pt)));
try {
    String line = br.readLine();
    while (line != null) {
        System.out.println(line);
        line = br.readLine();
    }
} finally {
    br.close();
}
Upvotes: 2
Reputation: 2202
The answer above points in the right direction. Allow me to add the following:
You were able to list directory contents because hostname:9000 (the NameNode) was accessible to your client code, and metadata operations such as listing and exists() only talk to the NameNode. To be able to read and write file contents, your client code also needs access to the DataNode. The default port for DataNode DFS data transfer is 50010, and something was blocking your client's communication to hostname:50010, possibly a firewall or SSH tunneling configuration problem.
I was using Hadoop 2.7.2, so maybe you have a different port number setting.
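A quick way to confirm reachability from the client machine is a plain TCP socket test against the DataNode transfer port. This is only a sketch: the class name is made up, and the host and port assume test.server and the default 50010 from the question, so adjust them to your dfs.datanode.address setting.
import java.net.InetSocketAddress;
import java.net.Socket;

public class DataNodePortCheck {
    public static void main(String[] args) throws Exception {
        // 50010 is the default DataNode data-transfer port in Hadoop 2.x.
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress("test.server", 50010), 5000);
            System.out.println("DataNode data-transfer port is reachable");
        }
        // A ConnectException or SocketTimeoutException here points to a
        // firewall/network problem rather than to HDFS itself.
    }
}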
Upvotes: 2
Reputation: 3462
Hadoop error messages are frustrating; often they don't say what they mean and have nothing to do with the real issue. I've seen problems like this occur when the client, namenode, and datanode cannot communicate properly, and in your case I would start there.
The host name "test.server" is very suspicious. Check all of the following:
Any inconsistency in the use of FQDN, hostname, numeric IP, and localhost must be removed. Do not ever mix them in your conf files or in your client code. Consistent use of the FQDN is preferred; consistent use of the numeric IP usually also works. Using an unqualified hostname, localhost, or 127.0.0.1 causes problems.
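To make that concrete, here is a minimal client-side sketch. The class name is illustrative, and it assumes the DataNode is known under the FQDN test.server (as the Hostname field of your dfsadmin report suggests); the dfs.client.use.datanode.hostname client property asks the HDFS client to connect to DataNodes by hostname rather than the 127.0.0.1 address the NameNode reports back. It is a sketch under those assumptions, not a guaranteed fix; the cluster's own config files may also need the same consistent naming.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FqdnClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Use the FQDN consistently; fs.defaultFS is the Hadoop 2.x key
        // (fs.default.name is its deprecated alias).
        conf.set("fs.defaultFS", "hdfs://test.server:9000");
        // The dfsadmin report shows the DataNode registered as 127.0.0.1:50010,
        // which a remote client cannot reach; this tells the client to use the
        // DataNode's hostname instead of that reported IP.
        conf.setBoolean("dfs.client.use.datanode.hostname", true);

        FileSystem fs = FileSystem.get(URI.create("hdfs://test.server:9000"), conf);
        try {
            Path p = new Path("/usr/test/test_file.txt");
            System.out.println(p.getName() + " exists: " + fs.exists(p));
        } finally {
            fs.close();
        }
    }
}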
Upvotes: 2