G Krishna Sampath

Reputation: 97

Independent map reduce jobs to be executed one after the other

Is it possible to execute independent MapReduce jobs (not chained, i.e. the output of one job's reducer does not become the input of the next job's mapper) one after the other?

Upvotes: 1

Views: 151

Answers (3)

d pavan kumar reddy

Reputation: 1

You can go with parallel job running. Sample code is given below:

Configuration conf = new Configuration();
Path job1InputDir = new Path(args[0]);
Path job2InputDir = new Path(args[1]);
Path job1OutputDir = new Path(args[2]);
Path job2OutputDir = new Path(args[3]);

// submitJob() configures and submits each job without waiting for it,
// so both jobs run at the same time
Job job1 = submitJob(conf, job1InputDir, job1OutputDir);
Job job2 = submitJob(conf, job2InputDir, job2OutputDir);

// While both jobs are not finished, sleep
while (!job1.isComplete() || !job2.isComplete()) {
    Thread.sleep(5000);
}

if (job1.isSuccessful()) {
    System.out.println("Job1 completed successfully!");
} else {
    System.out.println("Job1 failed!");
}
if (job2.isSuccessful()) {
    System.out.println("Job2 completed successfully!");
} else {
    System.out.println("Job2 failed!");
}

System.exit(job1.isSuccessful() && job2.isSuccessful() ? 0 : 1);
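
The snippet assumes a submitJob() helper that is not shown in the answer. A minimal sketch of such a helper, with placeholder driver/mapper/reducer class names that you would replace with your own, could look like this:

private static Job submitJob(Configuration conf, Path inputDir, Path outputDir)
        throws Exception {
    Job job = new Job(conf);
    job.setJarByClass(ParallelJobsDriver.class); // placeholder driver class
    job.setMapperClass(MyMapper.class);          // placeholder mapper
    job.setReducerClass(MyReducer.class);        // placeholder reducer
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, inputDir);
    FileOutputFormat.setOutputPath(job, outputDir);
    // submit() returns immediately instead of blocking,
    // which is what lets the two jobs run in parallel
    job.submit();
    return job;
}

The key difference from waitForCompletion(true) is that submit() does not block, so the caller can launch the second job right away and poll both with isComplete().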

Upvotes: 0

Sravan K Reddy

Reputation: 1082

In your driver code, call two methods, runFirstJob and runSecondJob, just like this. Because each method uses waitForCompletion(true), the second job starts only after the first one has finished. This is just a hint; modify it according to your needs.

public class ExerciseDriver {


static Configuration conf;

public static void main(String[] args) throws Exception {

    ExerciseDriver ED = new ExerciseDriver();
    conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    if(args.length < 4) {
        System.out.println("Too few arguments. Arguments should be:  <hdfs input folder> <hdfs output folder> <N configurable Integer Value>");
        System.exit(0);
    }

    String pathin1stmr = args[0];
    String pathout1stmr = args[1];
    String pathin2ndmr = args[2];
    String pathout2ndmr = args[3];

    ED.runFirstJob(pathin1stmr, pathout1stmr);

    ED.runSecondJob(pathin2ndmr, pathout2ndmr);

}

public int runFirstJob(String pathin, String pathout) throws Exception {

    Job job = new Job(conf);
    job.setJarByClass(ExerciseDriver.class);
    job.setMapperClass(ExerciseMapper1.class);
    job.setCombinerClass(ExerciseCombiner.class);
    job.setReducerClass(ExerciseReducer1.class);
    job.setInputFormatClass(ParagrapghInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class); 
    FileInputFormat.addInputPath(job, new Path(pathin));
    FileOutputFormat.setOutputPath(job, new Path(pathout));

    // waitForCompletion(true) submits the job and blocks until it finishes,
    // so runSecondJob() only starts after this job has completed
    boolean success = job.waitForCompletion(true);
    return success ? 0 : -1;

}

  public int runSecondJob(String pathin, String pathout) throws Exception { 
    Job job = new Job(conf);
    job.setJarByClass(ExerciseDriver.class);
    job.setMapperClass(ExerciseMapper2.class);
    job.setReducerClass(ExerciseReducer2.class);
    job.setInputFormatClass(KeyValueTextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);    
    FileInputFormat.addInputPath(job,new Path(pathin));
    FileOutputFormat.setOutputPath(job, new Path(pathout));
    boolean success = job.waitForCompletion(true);
    return success ? 0 : -1;
}

 }

Upvotes: 1

Vanaja Jayaraman

Reputation: 781

If you want to execute one job after another, you can chain your jobs as described in the link below:

http://unmeshasreeveni.blogspot.in/2014/04/chaining-jobs-in-hadoop-mapreduce.html
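
As a rough sketch of the idea behind that post (this is not the code from the link, and the driver/mapper/reducer class names are placeholders), a chained driver waits for the first job to finish and then points the second job at the first job's output directory:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainDriver {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // First job: reads the original input, writes to an intermediate directory
        Job job1 = new Job(conf, "first job");
        job1.setJarByClass(ChainDriver.class);
        job1.setMapperClass(FirstMapper.class);      // placeholder
        job1.setReducerClass(FirstReducer.class);    // placeholder
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job1, new Path(args[0]));
        Path intermediate = new Path(args[1]);
        FileOutputFormat.setOutputPath(job1, intermediate);

        // Block until job1 has finished; stop if it failed
        if (!job1.waitForCompletion(true)) {
            System.exit(1);
        }

        // Second job: reads the output of the first job
        Job job2 = new Job(conf, "second job");
        job2.setJarByClass(ChainDriver.class);
        job2.setMapperClass(SecondMapper.class);     // placeholder
        job2.setReducerClass(SecondReducer.class);   // placeholder
        job2.setOutputKeyClass(Text.class);
        job2.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job2, intermediate);
        FileOutputFormat.setOutputPath(job2, new Path(args[2]));

        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}

If your two jobs are truly independent, as in the original question, you can drop the intermediate-directory dependency and simply give the second job its own input path; the blocking waitForCompletion(true) calls are what make them run one after the other.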

Upvotes: 0
