Venk K

Reputation: 1177

Getting the number of input files added to a Hadoop MR job

How do I get the number of input files I have added through calls to FileInputFormat.addInputPath and FileInputFormat.addInputPaths? I am adding input files that match a pattern. When no file matches the pattern, so the MR job has no input files, I want to log a message for the user and not submit the job at all.

Thanks,

Venkat

Upvotes: 1

Views: 468

Answers (1)

Charles Menguy

Reputation: 41428

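FileInputFormat stores the input paths as a single comma-separated string in the Configuration property mapred.input.dir (renamed mapreduce.input.fileinputformat.inputdir in newer Hadoop releases). Reading it back gives you the number of paths or patterns you added, not the number of files they match. You can use the following: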

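Configuration conf = job.getConfiguration();
// null if no input path has been added yet
String dirs = conf.get("mapred.input.dir");
// "".split(",") yields one empty element, so treat unset/empty as zero
int numDirs = (dirs == null || dirs.isEmpty()) ? 0 : dirs.split(",").length;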

The relevant part of the source code that does this is:

conf.set("mapred.input.dir", dirs == null ? dirStr : dirs + "," + dirStr);
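As a rough sketch of the counting step, the property value can be split by hand. This assumes that dirStr in the line above is the path with any literal commas backslash-escaped, so a plain split(",") could over-count such paths. The class and method names here (InputPathCounter, splitInputDirs, countInputPaths) are hypothetical helpers, not part of Hadoop, and the code only needs the string returned by conf.get(...):

```java
import java.util.ArrayList;
import java.util.List;

public class InputPathCounter {

    /**
     * Splits the comma-separated property value into individual entries.
     * Assumption: a backslash escapes the character that follows it,
     * so "\," is a literal comma inside a path, not a separator.
     */
    static List<String> splitInputDirs(String dirs) {
        List<String> paths = new ArrayList<>();
        if (dirs == null || dirs.isEmpty()) {
            return paths; // no input path was ever added
        }
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < dirs.length(); i++) {
            char c = dirs.charAt(i);
            if (c == '\\' && i + 1 < dirs.length()) {
                current.append(dirs.charAt(++i)); // keep the escaped character literally
            } else if (c == ',') {
                paths.add(current.toString());    // unescaped comma ends an entry
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        paths.add(current.toString());            // last entry has no trailing comma
        return paths;
    }

    /** Number of entries added, or 0 when the property is unset or empty. */
    static int countInputPaths(String dirs) {
        return splitInputDirs(dirs).size();
    }

    public static void main(String[] args) {
        // In a driver, the argument would be conf.get("mapred.input.dir").
        String dirs = args.length > 0 ? args[0] : null;
        if (countInputPaths(dirs) == 0) {
            System.err.println("No input paths configured; not submitting the job.");
        } else {
            System.out.println("Input paths: " + splitInputDirs(dirs));
        }
    }
}
```

A driver could run a check like this just before submitting the job and return early when the count is zero. It still counts the entries that were added rather than the files they match, so a pattern that matches nothing is counted as one entry.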

Upvotes: 4
