dinesh

Reputation: 465

Reading HDFS through Java API

[possible duplicate] I am trying to read from HDFS using the Java API. Reading through the command line and a Hadoop URL works fine, but the problem is reading with HDFS paths. I went through Reading HDFS and local files in Java, but I am not able to find where I am wrong.

1) The command line gives this result:

  hduser@hduser-Satellite:~$ hadoop fs -ls 
  Found 3 items
  drwxr-xr-x   - hduser supergroup          0 2014-01-11 00:21 /user/hduser/In
  -rw-r--r--   1 hduser supergroup   37461150 2014-01-11 17:27 /user/hduser/loging.txt
  -rw-r--r--   3 hduser supergroup  112383446 2014-01-11 19:02 /user/hduser/loging1.txt

2)

    static {
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    InputStream in = null;
    try {
        // This is working fine:
        in = new URL("hdfs://localhost:54310/user/hduser/loging.txt")
                .openStream();

        // This is not working:
        in = new URL("hdfs://localhost/user/hduser/loging.txt").openStream();

        IOUtils.copyBytes(in, System.out, 4096, false);
    } finally {
        IOUtils.closeStream(in);
    }

The failing call says:

    14/01/11 19:54:55 INFO ipc.Client: Retrying connect to server:
    localhost/127.0.0.1:8020. Already tried 0 time(s).
    ...
    14/01/11 19:55:04 INFO ipc.Client: Retrying connect to server:
    localhost/127.0.0.1:8020. Already tried 9 time(s).
    Exception in thread "main" java.net.ConnectException: Call to
    localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused
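For reference, a complete, compilable version of the working variant might look like the sketch below (the class name HdfsUrlCat is made up for illustration, and it assumes the NameNode really is listening on localhost:54310, as the working URL above suggests):

    import java.io.InputStream;
    import java.net.URL;

    import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
    import org.apache.hadoop.io.IOUtils;

    public class HdfsUrlCat {
        static {
            // May only be set once per JVM; teaches java.net.URL the hdfs:// scheme.
            URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
        }

        public static void main(String[] args) throws Exception {
            InputStream in = null;
            try {
                // Explicit port: must match fs.default.name (54310 here, not the 8020 default).
                in = new URL("hdfs://localhost:54310/user/hduser/loging.txt").openStream();
                IOUtils.copyBytes(in, System.out, 4096, false);
            } finally {
                IOUtils.closeStream(in);
            }
        }
    }

The second URL fails precisely because it omits the port, so the client falls back to the default 8020, where nothing is listening.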

3) This gives an exception:

    Configuration configuration = new Configuration();
    configuration.addResource(new Path("/hadoop/conf/core-site.xml"));
    configuration.addResource(new Path("/hadoop/conf/hdfs-site.xml"));

    FileSystem fileSystem = FileSystem.get(configuration);
    System.out.println(fileSystem.getHomeDirectory());
    Path path = new Path("/user/hduser/loging.txt");

    FSDataInputStream in = fileSystem.open(path);
    System.out.println(in);
    byte[] b = new byte[1024];
    int numBytes = 0;
    while ((numBytes = in.read(b)) > 0) {
        //processing
    }

    in.close();

    fileSystem.close();
Exception:
file:/home/hduser
Exception in thread "main" java.io.FileNotFoundException: File /user/hduser/loging.txt does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
at HdfsUrl.read(HdfsUrl.java:29)
at HdfsUrl.main(HdfsUrl.java:58)
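The printed home directory file:/home/hduser is the giveaway here: FileSystem.get(configuration) has fallen back to the local file system (note RawLocalFileSystem in the stack trace), which happens when the addResource paths do not point at the real config files and fs.default.name keeps its file:/// default. Below is a minimal sketch of setting the HDFS URI programmatically instead (the class name HdfsRead is made up, and the NameNode address is assumed from the URL that worked in 2)):

    import java.io.InputStream;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class HdfsRead {
        public static void main(String[] args) throws Exception {
            Configuration configuration = new Configuration();
            // Assumption: NameNode address taken from the URL that worked in 2).
            configuration.set("fs.default.name", "hdfs://localhost:54310");

            FileSystem fileSystem = FileSystem.get(configuration);
            // Should now print hdfs://localhost:54310/user/hduser rather than file:/home/hduser.
            System.out.println(fileSystem.getHomeDirectory());

            InputStream in = null;
            try {
                in = fileSystem.open(new Path("/user/hduser/loging.txt"));
                IOUtils.copyBytes(in, System.out, 4096, false);
            } finally {
                IOUtils.closeStream(in);
                fileSystem.close();
            }
        }
    }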

Upvotes: 0

Views: 1658

Answers (1)

Ashish

Reputation: 1121

The kind of exception you are getting suggests that your application is trying to connect to a namenode that does not have port 8020 open. As per this documentation, 8020 is the default namenode port. I'd suggest adding your hostname and port info to your core-site.xml, something like this:

    <property>
        <name>fs.default.name</name>
        <value>hdfs://[namenode]:[port]</value>
    </property>
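With the values from the question, this would presumably be the address that already worked in the URL-based read:

    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:54310</value>
    </property>

Once the configuration resolves, FileSystem.get(configuration) should return an HDFS file system, and fileSystem.getHomeDirectory() should print an hdfs:// URI instead of file:/home/hduser.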

Upvotes: 1
