Reputation: 2681
I'm trying to read the contents of a file from HDFS. My code is below -
package gen;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class ReadFromHDFS {
    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("Usage: ReadFromHDFS <hdfs-file-path-to-read-from>");
            System.out.println("Example: ReadFromHDFS 'hdfs:/localhost:9000/myFirstSelfWriteFile'");
            System.exit(-1);
        }
        try {
            Path path = new Path(args[0]);
            FileSystem fileSystem = FileSystem.get(new Configuration());
            BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(fileSystem.open(path)));
            String line = bufferedReader.readLine();
            while (line != null) {
                System.out.println(line);
                line = bufferedReader.readLine();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
However, I can't figure out how to give this program the path to my HDFS directory. I have tried -
java -cp <hadoop jar:myjar> gen.ReadFromHDFS <path>
where for <path> I tried the directory itself (what I see when I do hadoop fs -ls), the file inside the directory, and the prefixes hdfs:/localhost and hdfs:/. None of them work. Can anyone tell me exactly how I should pass the path of my HDFS folder? For example, when I give the path directly (with no prefix), it says that the file does not exist.
Edit: None of the solutions so far seem to work for me. I always get the exception -
java.io.FileNotFoundException: File <filename> does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
It seems to be trying to find the file locally.
Upvotes: 2
Views: 4515
Reputation: 1221
Try this:
FileSystem fileSystem = FileSystem.get(new Configuration());
Path path = new Path(fileSystem.getName() + "/" + args[0]);
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(fileSystem.open(path)));
String line = bufferedReader.readLine();
Then give the file's path in HDFS with no prefix:
"/myFirstSelfWriteFile"
Do not include "hdfs:/localhost".
Upvotes: 2
Reputation: 8522
It looks like you are missing one / in your path. The scheme (hdfs:) must be followed by two slashes before the host. Try specifying the following path:
hdfs://localhost:9000/myFirstSelfWriteFile
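To see why the second slash matters, here is a small sketch that parses both forms with the standard java.net.URI class. It uses plain Java rather than Hadoop's own Path parsing, so treat it as an illustration of the URI syntax only; the class name HdfsUriCheck and the helper describe are made up for this example.

```java
import java.net.URI;

public class HdfsUriCheck {
    // Hypothetical helper: summarizes how a URI string splits into its parts.
    static String describe(String s) {
        URI u = URI.create(s);
        return "scheme=" + u.getScheme()
                + " host=" + u.getHost()
                + " port=" + u.getPort()
                + " path=" + u.getPath();
    }

    public static void main(String[] args) {
        // One slash: there is no authority, so "localhost:9000" becomes part of the path.
        System.out.println(describe("hdfs:/localhost:9000/myFirstSelfWriteFile"));
        // Two slashes: "localhost:9000" is parsed as host and port.
        System.out.println(describe("hdfs://localhost:9000/myFirstSelfWriteFile"));
    }
}
```

With a single slash the host comes back as null and the path is /localhost:9000/myFirstSelfWriteFile, which is not the file you meant.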
Upvotes: 0
Reputation: 16392
You need to use the classes in the org.apache.hadoop.fs package (FileSystem, FSDataInputStream, FSDataOutputStream and Path). There are several articles out there, but I'd use this one from the Hadoop Wiki.
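As a rough sketch of how those classes fit together, the following reads a file by asking for the file system that matches the URI, instead of relying on the default one from new Configuration(). The stack trace in the question points at RawLocalFileSystem, which suggests the default resolved to the local file system, possibly because core-site.xml was not on the classpath; that is an assumption, not something the question confirms. The class name ReadFromHdfsByUri is made up, and the example needs the Hadoop client jars (including the HDFS ones) on the classpath.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadFromHdfsByUri {
    public static void main(String[] args) throws IOException {
        // Expects a fully qualified URI, e.g. hdfs://localhost:9000/myFirstSelfWriteFile
        // (host and port are taken from the thread; yours may differ).
        URI uri = URI.create(args[0]);
        Configuration conf = new Configuration();

        // FileSystem.get(uri, conf) picks the implementation from the URI's scheme
        // and authority rather than from the configured default file system.
        try (FileSystem fs = FileSystem.get(uri, conf);
             FSDataInputStream in = fs.open(new Path(uri));
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

The file is assumed to be UTF-8 text. If the program still reports the file as missing, check that the URI matches the NameNode address your cluster is actually configured with.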
Upvotes: 0