How can I have multiple mappers and reducers?

Question

I have the code below, in which I have set one mapper and one reducer. I want to add one more mapper and one more reducer to do further processing. The problem is that the output file of the first MapReduce job has to be the input to the next MapReduce job. Is that possible? If yes, how can I do it?

    public int run(String[] args) throws Exception {
        JobConf conf = new JobConf(getConf(), DecisionTreec45.class);
        conf.setJobName("c4.5");

        // output keys are strings (Text)
        conf.setOutputKeyClass(Text.class);
        // output values are ints (IntWritable)
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(MyMapper.class);
        conf.setReducerClass(MyReducer.class);

        // set your input file path below
        FileInputFormat.setInputPaths(conf, "/home/hduser/Id3_hds/playtennis.txt");
        FileOutputFormat.setOutputPath(conf, new Path("/home/hduser/Id3_hds/1/output" + current_index));

        // runJob blocks until the job has finished
        JobClient.runJob(conf);
        return 0;
    }

alekya reddy · Accepted Answer

Yes, it's possible. The idea is to run the jobs one after another and use the output path of the first job as the input path of the second. The following tutorial shows how the chaining works: http://gandhigeet.blogspot.com/2012/12/as-discussed-in-previous-post-hadoop.html

Make sure you delete the intermediate output that each MR phase creates in HDFS once the job that reads it has finished, for example with fs.delete(intermediateOutputPath, true) (the second argument makes the delete recursive).
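To make the control flow concrete, here is a rough sketch of the driver pattern in plain Java. It is only a stand-in, not Hadoop API code: `ChainSketch`, `Stage`, `runChain` and `deleteRecursively` are hypothetical names, the paths are `java.nio.file.Path` (not Hadoop's `Path`), and each stage reads and writes a single local file rather than an HDFS directory. In a real driver each stage would build its own `JobConf` and call `JobClient.runJob`, as noted in the comments.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ChainSketch {

    /**
     * Hypothetical stand-in for one MapReduce job: read 'in', write 'out'.
     * In a Hadoop driver (old mapred API) this is where you would create a
     * JobConf, set its mapper/reducer, call FileInputFormat.setInputPaths
     * and FileOutputFormat.setOutputPath, then call JobClient.runJob(conf).
     */
    interface Stage {
        void run(Path in, Path out) throws IOException;
    }

    /**
     * Runs the stages in order. Stage i reads what stage i-1 wrote, and only
     * the last stage writes to finalOutput. The intermediate outputs are
     * deleted when the chain finishes (or fails).
     */
    static void runChain(Path input, Path finalOutput, List<Stage> stages) throws IOException {
        Path workDir = Files.createTempDirectory("chain-");
        try {
            Path current = input;
            for (int i = 0; i < stages.size(); i++) {
                boolean last = (i == stages.size() - 1);
                Path out = last ? finalOutput : workDir.resolve("stage-" + i);
                stages.get(i).run(current, out); // returns only when this stage is done
                current = out;                   // the next stage reads this output
            }
        } finally {
            deleteRecursively(workDir);          // clean up intermediate data
        }
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

The important part is that each stage is run to completion before the next one starts, and that the intermediate location is owned by the driver so it can be removed afterwards. Keep in mind that a real Hadoop job refuses to start if its output directory already exists, so each stage needs a fresh output path.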
