stholy

Reputation: 322

FileNotFoundException on hadoop

Inside my map function, I am trying to read a file from the DistributedCache and load its contents into a HashMap.

The sysout log of the MapReduce job prints the contents of the HashMap, which shows that the mapper found the file, loaded it into the data structure, and performed the needed operation: it iterates through the map and prints its contents, proving the load was successful.

However, the MR job still fails with the error below after a few minutes of running:

13/01/27 18:44:21 INFO mapred.JobClient: Task Id : attempt_201301271841_0001_m_000001_2, Status : FAILED
java.io.FileNotFoundException: File does not exist: /app/hadoop/jobs/nw_single_pred_in/predict
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.(DFSClient.java:1834)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:67)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

Here's the portion of the driver which initializes a Path with the location of the file to be placed in the distributed cache:


    // inside main, surrounded by a try/catch block, yet no exception is thrown here
    Configuration conf = new Configuration();
    // rest of the stuff that relates to conf
    Path knowledgefilepath = new Path(args[3]); // args[3] = /app/hadoop/jobs/nw_single_pred_in/predict/knowledge.txt
    DistributedCache.addCacheFile(knowledgefilepath.toUri(), conf);
    job.setJarByClass(NBprediction.class);
    // rest of the job settings
    job.waitForCompletion(true); // kick off the job
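Not the answer to the failure itself, but one thing worth ruling out early: `DistributedCache.addCacheFile` expects a URI that names the file, not its parent directory, and the path in the error is exactly the parent directory of `args[3]`. A minimal JDK-only sanity check (no Hadoop dependencies; the helper name is hypothetical) that could run before submitting the job:

```java
import java.net.URI;

public class CacheArgCheck {
    // Hypothetical helper: extract the file-name component of the cache
    // argument and reject values that look like a directory, since
    // addCacheFile expects a URI naming a single file.
    static String fileName(String arg) {
        URI uri = URI.create(arg);
        String path = (uri.getPath() != null) ? uri.getPath() : arg;
        if (path.endsWith("/")) {
            throw new IllegalArgumentException("Expected a file, got a directory: " + arg);
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // prints "knowledge.txt"
        System.out.println(fileName("/app/hadoop/jobs/nw_single_pred_in/predict/knowledge.txt"));
    }
}
```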

This one is inside the map function:


    try {
        System.out.println("Inside try !!");
        Path[] files = DistributedCache.getLocalCacheFiles(context.getConfiguration());
        Path cfile = new Path(files[0].toString()); // only one file
        System.out.println("File path : " + cfile.toString());
        CSVReader reader = new CSVReader(new FileReader(cfile.toString()), '\t');
        while ((nline = reader.readNext()) != null) {
            data.put(nline[0], Double.parseDouble(nline[1])); // load into a hashmap
        }
    } catch (Exception e) {
        // handle exception
    }
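The load loop above depends on opencsv's `CSVReader`; a self-contained sketch of the same key/value parse using only the JDK (class and method names here are illustrative, not from the original job):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class KnowledgeLoader {
    // Parse tab-separated "key<TAB>value" lines into a map, mirroring the
    // mapper's CSVReader loop but using only java.io so the sketch runs
    // without Hadoop or opencsv on the classpath.
    static Map<String, Double> load(Reader in) throws IOException {
        Map<String, Double> data = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] nline = line.split("\t");
                if (nline.length >= 2) {
                    data.put(nline[0], Double.parseDouble(nline[1]));
                }
            }
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Double> data = load(new StringReader("a\t1.5\nb\t2.0"));
        System.out.println(data.get("a")); // prints 1.5
    }
}
```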

Help appreciated.

Cheers!

Upvotes: 0

Views: 1209

Answers (1)

stholy

Reputation: 322

I did a fresh installation of Hadoop and ran the job with the same jar, and the problem disappeared. It seems to have been a bug in the setup rather than a programming error.

Upvotes: 0
