Reputation: 21
I was wondering whether it is possible to define a hierarchical MapReduce job. In other words, I would like to have a MapReduce job that, in its mapper phase, calls a different MapReduce job. Is this possible? Do you have any recommendations on how to do it?
I want to do this in order to have an additional level of parallelism/distribution in my program. Thanks, Arik.
Upvotes: 1
Views: 554
Reputation: 56
I guess you need Oozie. Oozie lets you define workflows of Hadoop jobs in an XML file.
Upvotes: 0
Reputation: 13046
The book "Hadoop: The Definitive Guide" contains a lot of recipes related to MapReduce job chaining, including sample code and detailed explanations. Look especially at the chapter called something like 'advanced API usage'.
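To make the idea of job chaining concrete, here is a minimal sketch of the data flow only. It is plain Java with in-memory collections, not the Hadoop API, and the class and method names (`ChainedJobsSketch`, `wordCount`, `wordsByCount`, `runChain`) are made up for illustration. The point it tries to show is that the second "job" starts only after the first has finished and consumes its output, rather than being launched from inside a mapper.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical in-memory stand-in for two chained MapReduce jobs.
public class ChainedJobsSketch {

    // Stage 1: "map" splits each line into words, "reduce" counts each word.
    static Map<String, Long> wordCount(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    // Stage 2: takes stage 1 output; "map" re-keys by count,
    // "reduce" collects the words that share a count.
    static Map<Long, List<String>> wordsByCount(Map<String, Long> counts) {
        return counts.entrySet().stream()
                .collect(Collectors.groupingBy(
                        Map.Entry::getValue, TreeMap::new,
                        Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
    }

    // The "driver": run stage 1 to completion, then feed its output to stage 2.
    static Map<Long, List<String>> runChain(List<String> lines) {
        return wordsByCount(wordCount(lines));
    }

    public static void main(String[] args) {
        // prints {1=[c], 2=[b], 3=[a]}
        System.out.println(runChain(List.of("a b a", "b c a")));
    }
}
```

In a real Hadoop setup, the driver would presumably submit the first job, wait for it to complete, and then point the second job's input at the first job's output directory; the sketch above only mirrors that ordering.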
I personally succeeded in replacing a complex MapReduce job that used several HBase tables as sources with a handmade TableInputFormat extension. The result was an input format that combines the source data with minimal reduction, so the job was transformed into a single mapper step. So I recommend you look in this direction too.
Upvotes: 2