Reputation: 582
I have a situation: for a POC I want to run a nested MapReduce within a single job. The output of mapper M1 goes to reducer R1, R1's output goes to mapper M2, and the final output comes either from M2 directly or from a reducer R2 run on M2's output.
Single job ID: M1 -> R1 -> M2 -> R2 ... with the final output in a single output file.
Can we do it without Oozie?
Upvotes: 1
Views: 78
Reputation: 1170
You can also use JobControl, which lets you execute a number of MapReduce jobs in sequence. Your case has two mappers and one or two reducers, so you can set up two MapReduce jobs. If the second job does not need a reducer, set its number of reducers to zero.
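A minimal sketch of the JobControl approach, assuming the Hadoop 2.x+ `org.apache.hadoop.mapreduce` API on the classpath. The class name `JobControlDriver`, the helper `buildJob`, the job names, and the three path arguments are hypothetical; the base `Mapper` and `Reducer` classes act as identity placeholders for M1/R1/M2. Note that this still submits two separate jobs, each with its own job ID.

```java
import java.util.ArrayList;
import java.util.Collections;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JobControlDriver {

    // Hypothetical helper: builds one stage with identity map/reduce placeholders.
    static Job buildJob(Configuration conf, String name, Path in, Path out) throws Exception {
        Job job = Job.getInstance(conf, name);
        job.setJarByClass(JobControlDriver.class);
        job.setMapperClass(Mapper.class);          // replace with your M1 / M2
        job.setOutputKeyClass(LongWritable.class); // matches the default TextInputFormat keys
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, in);
        FileOutputFormat.setOutputPath(job, out);
        return job;
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]);
        Path output = new Path(args[2]);

        Job job1 = buildJob(conf, "stage-1", input, intermediate); // M1 -> R1
        Job job2 = buildJob(conf, "stage-2", intermediate, output); // M2 only
        job2.setNumReduceTasks(0); // map-only second stage: no R2

        ControlledJob first = new ControlledJob(job1, new ArrayList<ControlledJob>());
        // The second stage is submitted only after the first one succeeds.
        ControlledJob second = new ControlledJob(job2, Collections.singletonList(first));

        JobControl control = new JobControl("two-stage");
        control.addJob(first);
        control.addJob(second);

        // JobControl is a Runnable; run it in a background thread and poll.
        Thread runner = new Thread(control);
        runner.setDaemon(true);
        runner.start();
        while (!control.allFinished()) {
            Thread.sleep(1000);
        }
        boolean ok = control.getFailedJobList().isEmpty();
        control.stop();
        System.exit(ok ? 0 : 1);
    }
}
```

With zero reducers, the second stage writes one output file per map task. If a single output file is required, one option is to keep a reducer in the second stage and call `job2.setNumReduceTasks(1)` instead.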
Upvotes: 0
Reputation: 13402
You can chain multiple jobs in your Driver class. First, create a job for the first MapReduce pass by defining all the required configuration. Then start the job as usual by calling:
job1.waitForCompletion(true);
This waits until the job is finished. Then check the final status of the first job (failed or succeeded) to decide on the appropriate next action.
If the first job completes successfully, launch the next MapReduce job in the same way. Define the required parameters and launch the job with:
job2.waitForCompletion(true);
The important point is that the output path of the first job is the input path of the second job. This is serial (sequential) job chaining, because the two jobs run one after another.
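Put together, the driver could look roughly like the sketch below. It assumes the Hadoop 2.x+ `org.apache.hadoop.mapreduce` API; the class name `ChainDriver`, the job names, and the three path arguments are made up for illustration, and the base `Mapper` and `Reducer` classes stand in as identity placeholders for your own M1, R1, M2 and R2.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]); // output of job1, input of job2
        Path output = new Path(args[2]);

        // First job: M1 -> R1 (identity placeholders here).
        Job job1 = Job.getInstance(conf, "stage-1");
        job1.setJarByClass(ChainDriver.class);
        job1.setMapperClass(Mapper.class);
        job1.setReducerClass(Reducer.class);
        job1.setOutputKeyClass(LongWritable.class); // default TextInputFormat key type
        job1.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job1, input);
        FileOutputFormat.setOutputPath(job1, intermediate);

        // Block until job1 finishes; stop the chain if it failed.
        if (!job1.waitForCompletion(true)) {
            System.exit(1);
        }

        // Second job: M2 -> R2, reading what job1 wrote.
        Job job2 = Job.getInstance(conf, "stage-2");
        job2.setJarByClass(ChainDriver.class);
        job2.setMapperClass(Mapper.class);
        job2.setReducerClass(Reducer.class);
        job2.setNumReduceTasks(1); // a single reducer yields a single part file
        job2.setOutputKeyClass(LongWritable.class);
        job2.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job2, intermediate);
        FileOutputFormat.setOutputPath(job2, output);

        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}
```

Each `Job` is submitted separately, so this approach produces two job IDs rather than one. The intermediate directory is left in place after the second job and can be deleted once the chain has finished.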
Upvotes: 1