Reputation: 14664
I have five MapReduce jobs that I currently run separately. I want to pipeline them so that the output of one job becomes the input of the next. At the moment I use a shell script to execute them all. Is there a way to write this in Java? Please provide an example.
Thanks
Upvotes: 1
Views: 3544
Reputation: 787
For your use case, I think Oozie is a good fit. Oozie is a workflow scheduler in which you define actions (map-reduce, java, shell, etc.) that perform computation, transformation, enrichment, and so on. For this case:
action A: i/p input, o/p a
action B: i/p a, o/p b
action C: i/p b, o/p c (final output)
Finally, you can persist c in HDFS and decide whether to keep or delete the intermediate outputs (a and b).
If you want to do the computation of all three actions in a single one, you can use Cascading. The official Cascading documentation explains it well, and you can also refer to my blog post on the same topic: https://tech.flipkart.com/expressing-etl-workflows-via-cascading-192eb5e7d85d
Upvotes: 0
Reputation: 4236
You may find JobControl to be the simplest method for chaining these jobs together. For more complex workflows, I'd recommend checking out Oozie.
Upvotes: 3
Reputation: 22905
Another possibility is Cascading, which also provides an abstraction layer on top of Hadoop. It seems to offer the same combination you get from Oozie workflows calling Pig scripts: you work closely with Hadoop concepts while letting Hadoop do the M/R heavy lifting.
Upvotes: 0
Reputation: 137
Oozie is the solution for you. You can submit map-reduce jobs, Hive jobs, Pig jobs, system commands, etc. through Oozie's action tags.
It even has a coordinator, which acts as a cron for your workflow.
Upvotes: 1
Reputation: 71
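Hi, I had a similar requirement. One way to do this is to
create the first job, submit it, and block until it finishes:
Job job1 = new Job( getConf() );
// ... set mapper, reducer, input and output paths for job1 ...
job1.waitForCompletion( true );
and then check its status before starting the next one:
if (job1.isSuccessful()) {
    // start another job with a different Mapper
    // change config (e.g. use job1's output directory as the input)
    Job job2 = new Job( getConf() );
}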
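The wait-then-check pattern above extends to five jobs if you put it in a loop. What follows is only a rough sketch of the control flow, not Hadoop code: `JobPipeline`, `Stage`, `runAll`, and the `stage-N` directory naming are all made-up names for illustration. In a real driver each stage would presumably build a Hadoop `Job`, point its input and output at the two paths it is given, and return the result of `job.waitForCompletion(true)` (wrapping the checked exceptions that call can throw).

```java
import java.util.ArrayList;
import java.util.List;

public class JobPipeline {

    /**
     * One pipeline stage (hypothetical interface, not part of Hadoop).
     * It reads from inputPath, writes to outputPath, and reports success.
     */
    @FunctionalInterface
    public interface Stage {
        boolean run(String inputPath, String outputPath);
    }

    /**
     * Runs the stages in order. The output directory of stage i becomes
     * the input directory of stage i + 1. Stops at the first failure.
     *
     * @return the final output path, or null if any stage failed
     */
    public static String runAll(List<Stage> stages, String inputPath, String workDir) {
        String current = inputPath;
        for (int i = 0; i < stages.size(); i++) {
            // Assumed naming scheme for intermediate directories.
            String output = workDir + "/stage-" + (i + 1);
            if (!stages.get(i).run(current, output)) {
                return null; // do not start later stages after a failure
            }
            current = output;
        }
        return current;
    }

    public static void main(String[] args) {
        List<Stage> stages = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            // Placeholder stage: a real one would configure and run a MapReduce job here.
            stages.add((in, out) -> {
                System.out.println("running stage: " + in + " -> " + out);
                return true;
            });
        }
        String result = runAll(stages, "/data/input", "/data/tmp");
        System.out.println("final output: " + result);
    }
}
```

Keeping each job behind a small interface like this means the failure handling and the path wiring live in one place, and cleaning up the intermediate directories afterwards is a matter of deleting everything under the work directory except the last stage's output.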
Upvotes: 2