Rohit Pandey

Reputation: 2681

Reading data from HDFS - my program can't find the path

I'm trying to read the contents of a file from HDFS. My code is below -

package gen;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadFromHDFS {

 public static void main(String[] args) throws Exception {

  if (args.length < 1) {
   System.out.println("Usage: ReadFromHDFS <hdfs-file-path-to-read-from>");
   System.out.println("Example: ReadFromHDFS 'hdfs:/localhost:9000/myFirstSelfWriteFile'");
   System.exit(-1);
  } 

  try {
   Path path = new Path(args[0]);
   FileSystem fileSystem = FileSystem.get(new Configuration());
   BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(fileSystem.open(path)));
   String line = bufferedReader.readLine();
   while (line != null) {
    System.out.println(line);
    line = bufferedReader.readLine();
   }
  } catch (IOException e) {
   e.printStackTrace();
  }
 }
}

However, I can't figure out how to give this program the path to my HDFS directory. I have tried -

java -cp <hadoop jar:myjar> gen.ReadFromHDFS <path>

For <path> I tried the directory itself (what I see when I run hadoop fs -ls), the file inside the directory, and the same paths prefixed with hdfs:/localhost and hdfs:/. None of them work. Can anyone tell me exactly how I should pass the HDFS path of my folder to the program? For example, when I give the path directly (with no prefix), it says that the file does not exist.

Edit: None of the solutions so far seem to work for me. I always get the exception -

  java.io.FileNotFoundException: File <filename> does not exist.
  at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)

It seems to be trying to find the file locally.

Upvotes: 2

Views: 4515

Answers (3)

Y.Prithvi

Reputation: 1221

Try

FileSystem fileSystem = FileSystem.get(new Configuration());
// Prefix the argument with the URI of the default filesystem.
// getUri() replaces the deprecated getName().
Path path = new Path(fileSystem.getUri() + "/" + args[0]);
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(fileSystem.open(path)));
String line = bufferedReader.readLine();

and give the file path in HDFS with no prefix:

"/myFirstSelfWriteFile"

Do not include "hdfs:/localhost".
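
A prefix-less path only ends up on HDFS if the default filesystem in the Configuration points there. FileSystem.get(new Configuration()) uses the configured default (fs.defaultFS, or fs.default.name in older releases), and that falls back to file:/// when core-site.xml is not on the classpath. That would match the RawLocalFileSystem frame in the stack trace from the question. As a rough analogy using only java.net.URI (no Hadoop classes), here is how the same path qualifies against two different defaults. The class and the qualify helper are made up for illustration, and the URIs are assumptions taken from the question.

```java
import java.net.URI;

public class DefaultFsSketch {

    // Hypothetical helper: qualifies a prefix-less path against a default
    // filesystem URI. This only imitates the idea; it is not Hadoop's code.
    static URI qualify(String defaultFs, String path) {
        return URI.create(defaultFs).resolve(path);
    }

    public static void main(String[] args) {
        String file = "/myFirstSelfWriteFile";

        // Default filesystem taken from a core-site.xml that points at HDFS.
        URI onHdfs = qualify("hdfs://localhost:9000/", file);
        System.out.println(onHdfs.getScheme() + " -> " + onHdfs);

        // Default filesystem when no Hadoop configuration is found.
        URI onLocal = qualify("file:///", file);
        System.out.println(onLocal.getScheme() + " -> " + onLocal);
    }
}
```

If the second case is what happens on your machine, adding the Hadoop configuration directory to the -cp list (or launching the class through the hadoop command) should let the Configuration pick up the HDFS default.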

Upvotes: 2

SachinJose

Reputation: 8522

It looks like you are missing a / in your path: the scheme must be followed by two slashes (hdfs://), not one. Try specifying the following path

hdfs://localhost:9000/myFirstSelfWriteFile
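
To see why the second slash matters, here is a small sketch that parses both spellings with java.net.URI. With a single slash there is no authority, so localhost:9000 becomes part of the path instead of naming the NameNode. The class name and the authorityOf helper are hypothetical, and the host and port are just the ones from the question.

```java
import java.net.URI;

public class HdfsUriSketch {

    // Returns the authority (host:port) of a URI string, or null if it has none.
    static String authorityOf(String uri) {
        return URI.create(uri).getAuthority();
    }

    public static void main(String[] args) {
        String oneSlash = "hdfs:/localhost:9000/myFirstSelfWriteFile";
        String twoSlashes = "hdfs://localhost:9000/myFirstSelfWriteFile";

        // One slash: no authority, everything after the scheme is the path.
        System.out.println(authorityOf(oneSlash));          // null
        System.out.println(URI.create(oneSlash).getPath()); // /localhost:9000/myFirstSelfWriteFile

        // Two slashes: host and port are the authority, the rest is the path.
        System.out.println(authorityOf(twoSlashes));          // localhost:9000
        System.out.println(URI.create(twoSlashes).getPath()); // /myFirstSelfWriteFile
    }
}
```

Note that a fully qualified URI only helps if the FileSystem object is chosen from the path itself. Path.getFileSystem(Configuration) does that, whereas FileSystem.get(Configuration) always returns the configured default filesystem.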

Upvotes: 0

Chris Gerken

Reputation: 16392

You need to be using the classes in package org.apache.hadoop.fs (FileSystem, FSDataInputStream, FSDataOutputStream and Path). There are several articles out there, but I'd use this one from the Hadoop Wiki

Upvotes: 0
