Bhavesh Shah

Reputation: 3379

How to read multiple files from multiple directories in Map-Reduce

I want to read multiple files from multiple directories in a Map-Reduce program. I tried setting the input paths in the main method:

FileInputFormat.setInputPaths(conf,new Path("hdfs://localhost:54310/user/test/"));
FileInputFormat.setInputPaths(conf,new Path("hdfs://localhost:54310/Test/test1/"));

But it is reading from just one file.

What should I do to read multiple files?

Please suggest a solution.

Thanks.

Upvotes: 2

Views: 5121

Answers (2)

Basapuram Kumar

Reputation: 179

To pass multiple input files from different directories, only the driver code needs to change. Register each input directory with MultipleInputs, together with the mapper that should process it, as in the driver code below:
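import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public int run(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "MultipleDirectoryAsInput");
    job.setJarByClass(DriverClass.class);

    // No job.setMapperClass() call: MultipleInputs assigns a mapper per input path.
    job.setReducerClass(ReducerClass.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(NullWritable.class);

    // args[0] and args[1] are the two input directories, args[2] is the output directory.
    MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, Map1Class.class);
    MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, Map2Class.class);
    FileOutputFormat.setOutputPath(job, new Path(args[2]));

    return job.waitForCompletion(true) ? 0 : 1;
}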

Upvotes: 0

Praveen Sripati

Reputation: 33495

FileInputFormat#setInputPaths replaces any input paths set earlier, so the second call in the question discards the first directory. Use FileInputFormat#addInputPath or FileInputFormat#addInputPaths to append to the existing paths instead.
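To make the set-versus-add difference concrete, here is a small stand-alone sketch. It is not Hadoop code: InputPathsModel, its methods, and the key name "input.dir" are all hypothetical, and the model only assumes that the job configuration keeps its input directories as a single comma-separated value under one key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Toy model (NOT Hadoop code) of a job configuration that stores its
 * input directories as one comma-separated string under a single key.
 */
public class InputPathsModel {
    // Hypothetical key name, chosen for this sketch only.
    static final String INPUT_DIR_KEY = "input.dir";

    private final Map<String, String> conf = new HashMap<>();

    /** "set" semantics: overwrite whatever was stored before. */
    public void setInputPaths(String... paths) {
        conf.put(INPUT_DIR_KEY, String.join(",", paths));
    }

    /** "add" semantics: append to whatever is already stored. */
    public void addInputPath(String path) {
        String existing = conf.get(INPUT_DIR_KEY);
        conf.put(INPUT_DIR_KEY, existing == null ? path : existing + "," + path);
    }

    /** Returns the stored directories in insertion order. */
    public List<String> getInputPaths() {
        String stored = conf.get(INPUT_DIR_KEY);
        if (stored == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(stored.split(",")));
    }

    public static void main(String[] args) {
        // Two "set" calls, as in the question: only the last directory survives.
        InputPathsModel twoSets = new InputPathsModel();
        twoSets.setInputPaths("hdfs://localhost:54310/user/test/");
        twoSets.setInputPaths("hdfs://localhost:54310/Test/test1/");
        System.out.println(twoSets.getInputPaths());

        // Two "add" calls: both directories are kept.
        InputPathsModel twoAdds = new InputPathsModel();
        twoAdds.addInputPath("hdfs://localhost:54310/user/test/");
        twoAdds.addInputPath("hdfs://localhost:54310/Test/test1/");
        System.out.println(twoAdds.getInputPaths());
    }
}
```

In the question's driver, the equivalent change would be to keep the first setInputPaths call (or drop it) and register each directory with FileInputFormat.addInputPath(conf, new Path(...)).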

Upvotes: 6
