HDFS has file but java.io.FileNotFoundException happens

Question

I am running a MapReduce program on Hadoop.

The InputFormat passes each file path to the mapper.

I can see the file from the command line:

$ hadoop fs -ls hdfs://slave1.kdars.com:8020/user/hadoop/num_5/13.pdf

Found 1 items
-rwxrwxrwx   3 hdfs hdfs     184269 2015-03-31 22:50 hdfs://slave1.kdars.com:8020/user/hadoop/num_5/13.pdf

However, when I try to open that file from the mapper, it fails:

15/04/01 06:13:04 INFO mapreduce.Job: Task Id : attempt_1427882384950_0025_m_000002_2, Status : FAILED
Error: java.io.FileNotFoundException: hdfs:/slave1.kdars.com:8020/user/hadoop/num_5/13.pdf (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at java.io.FileInputStream.<init>(FileInputStream.java:101)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1111)

I checked that the InputFormat works fine and that the mapper receives the right file path. The mapper code looks like this:

@Override
public void map(Text title, Text file, Context context) throws IOException, InterruptedException {

    long time = System.currentTimeMillis(); 
    SimpleDateFormat dayTime = new SimpleDateFormat("yyyy-mm-dd hh:mm:ss");
    String str = dayTime.format(new Date(time));

    File temp = new File(file.toString());
    if(temp.exists()){
        DBManager.getInstance().insertSQL("insert into `plagiarismdb`.`workflow` (`type`) value ('"+temp+" is exists')");
    }else{
        DBManager.getInstance().insertSQL("insert into `plagiarismdb`.`workflow` (`type`) value ('"+temp+" is not exists')");
    }
}

Help me please.

Sravan K Reddy · Accepted Answer

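java.io.File and java.io.FileInputStream only read the local file system of the node the task runs on. They do not understand the hdfs:// scheme, so the URI is treated as a local path (note that the exception shows it collapsed to hdfs:/slave1.kdars.com:8020/...), and that path does not exist locally. Go through the Hadoop FileSystem API instead.

First, import these.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Then, use them in your map method. Here value is the Text holding the file path (file in your code), and one and zero are placeholder output values.

// Reuse the job configuration so the task sees the cluster settings.
Configuration conf = context.getConfiguration();

Path path = new Path(value.toString());
// Resolve the file system from the path itself, so the hdfs:// authority is honored.
FileSystem fs = path.getFileSystem(conf);
System.out.println(path);

if (fs.exists(path)) {
    context.write(value, one);
} else {
    context.write(value, zero);
}

To read the file rather than just test for it, fs.open(path) returns an InputStream. If your PDFBox version has a PDDocument.load overload that accepts an InputStream, pass it that stream instead of the file name.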
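The hdfs:/ form in the exception message can be reproduced without a cluster. The following is a minimal sketch using only the JDK; the class name LocalFileVsHdfsUri and its helper methods are made up for illustration and are not part of Hadoop or of the code above. It shows what java.io.File does with an HDFS URI string, compared with parsing the same string as a URI.

```java
import java.io.File;
import java.net.URI;

// Hypothetical demo class, not part of Hadoop or PDFBox.
public class LocalFileVsHdfsUri {

    // The path string as java.io.File stores it after its own normalization.
    static String asLocalPath(String location) {
        return new File(location).getPath();
    }

    // The scheme as seen by a URI-aware parser.
    static String schemeOf(String location) {
        return URI.create(location).getScheme();
    }

    public static void main(String[] args) {
        String location = "hdfs://slave1.kdars.com:8020/user/hadoop/num_5/13.pdf";

        // java.io.File treats the whole string as a local path name.
        System.out.println("as local path:  " + asLocalPath(location));
        // The local file system has no such entry, so this is expected to be false.
        System.out.println("exists locally: " + new File(location).exists());
        // A URI parser keeps the scheme, which is what Hadoop's Path relies on.
        System.out.println("URI scheme:     " + schemeOf(location));
    }
}
```

On a Unix-like JVM the first line should show the doubled slash collapsed, matching the path in the FileNotFoundException. That is a sign the string was handled as a local path name rather than as an HDFS location.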
