Reputation: 3379
I want to read multiple files from multiple directories in a MapReduce program. I tried setting the input paths in the main method:
FileInputFormat.setInputPaths(conf,new Path("hdfs://localhost:54310/user/test/"));
FileInputFormat.setInputPaths(conf,new Path("hdfs://localhost:54310/Test/test1/"));
But it reads from only one of the paths.
How can I read from multiple files?
Please suggest a solution.
Thanks.
Upvotes: 2
Views: 5121
Reputation: 179
To pass multiple input files from different directories, only the driver code needs to change. Use MultipleInputs, as in the driver code below.
CODE:
public int run(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "MultipleDirectoryAsInput");
    job.setJarByClass(DriverClass.class);

    // No job.setMapperClass(...) calls here: the mapper for each input
    // path is registered through MultipleInputs.addInputPath below.
    job.setReducerClass(ReducerClass.class);

    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(NullWritable.class);

    // args[0] and args[1] are the two input directories, each with its own mapper.
    MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, Map1Class.class);
    MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, Map2Class.class);

    // args[2] is the output directory.
    FileOutputFormat.setOutputPath(job, new Path(args[2]));

    return job.waitForCompletion(true) ? 0 : 1;
}
Upvotes: 0
Reputation: 33495
FileInputFormat#setInputPaths overwrites any input paths that were set earlier. Use FileInputFormat#addInputPath or FileInputFormat#addInputPaths to add to the existing paths.
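To illustrate why only the second directory was read, here is a toy model in plain Java. It is not the Hadoop API: the class InputPathsSketch, the key name, and the method bodies are made up for this sketch. It only assumes that the input paths are kept as a single comma-separated configuration value, which "set" replaces and "add" appends to.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a job configuration. NOT the Hadoop API, only a model of
// "input paths live in one comma-separated configuration value".
public class InputPathsSketch {
    // Hypothetical key name, used only inside this sketch.
    static final String INPUT_DIR = "input.dir";

    // Models "set": replaces whatever was stored before.
    static void setInputPaths(Map<String, String> conf, String... paths) {
        conf.put(INPUT_DIR, String.join(",", paths));
    }

    // Models "add": appends to the value that is already stored.
    static void addInputPath(Map<String, String> conf, String path) {
        String existing = conf.get(INPUT_DIR);
        conf.put(INPUT_DIR, existing == null ? path : existing + "," + path);
    }

    static String inputPaths(Map<String, String> conf) {
        return conf.get(INPUT_DIR);
    }

    public static void main(String[] args) {
        // Two "set" calls, as in the question: the first path is lost.
        Map<String, String> overwritten = new HashMap<>();
        setInputPaths(overwritten, "/user/test");
        setInputPaths(overwritten, "/Test/test1");
        System.out.println(inputPaths(overwritten)); // prints /Test/test1

        // Two "add" calls: both paths are kept.
        Map<String, String> appended = new HashMap<>();
        addInputPath(appended, "/user/test");
        addInputPath(appended, "/Test/test1");
        System.out.println(inputPaths(appended)); // prints /user/test,/Test/test1
    }
}
```

In the question's driver, the corresponding change would be to replace both setInputPaths calls with FileInputFormat.addInputPath(conf, new Path(...)), or to pass both Path objects to a single setInputPaths call, since that method accepts several paths.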
Upvotes: 6