Reputation: 117
I am trying to implement an algorithm that requires only a single reducer and runs the MapReduce job iteratively. The output of each mapper in a given iteration is accumulated in the reducer and processed there; the reducer's output is then passed as input to the mappers in the next iteration. I want to execute the job asynchronously, i.e. as soon as a pre-defined number of mappers have finished, pass their output directly to the reducer, skipping the shuffle and sort phase, since it only creates overhead for my algorithm. Is that even possible? If not, what can be done at the implementation level to execute a MapReduce job asynchronously? I went through a number of research papers but could not get any idea from them.
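To make the intended control flow concrete, here is a minimal structural sketch in plain Java (no Hadoop involved; the map, reduce, and convergence logic are placeholders I made up for illustration): several map calls per round, a single reduce over all mapper outputs, and the reducer's result fed back as the next round's input.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/**
 * Structural sketch of the iterative single-reducer pattern.
 * All of the actual logic below is placeholder arithmetic.
 */
public class IterativeJobSketch {

    // "Mapper": each mapper halves its input value (placeholder logic).
    static long map(long value) {
        return value / 2;
    }

    // Single "reducer": sums all mapper outputs (placeholder logic).
    static long reduce(List<Long> mapperOutputs) {
        return mapperOutputs.stream().mapToLong(Long::longValue).sum();
    }

    // Driver: run rounds until the reduced value stops changing
    // or maxRounds is reached.
    static long run(List<Long> input, int maxRounds) {
        long result = reduce(input.stream()
                .map(IterativeJobSketch::map)
                .collect(Collectors.toList()));
        for (int round = 1; round < maxRounds; round++) {
            // The reducer output becomes the input of the next round.
            List<Long> next = List.of(result);
            long reduced = reduce(next.stream()
                    .map(IterativeJobSketch::map)
                    .collect(Collectors.toList()));
            if (reduced == result) {
                break; // converged
            }
            result = reduced;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(16L, 8L, 4L), 10)); // prints 0
    }
}
```

In real Hadoop this loop lives in the driver: each round submits a `Job` whose input path is the previous round's reducer output path.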
Thanks.
Upvotes: 2
Views: 184
Reputation: 3683
You have to code up your own custom solution for this. I did a similar thing in a project recently.
It requires a fair bit of code, so I can only outline the steps here :)
Things you will need to do:

1. Set mapreduce.job.reduce.slowstart.completedmaps to 0.0 so that the reducer comes up before the mappers finish (this will give you a speedup right away, btw. Try it out before going ahead with the steps below ;) maybe it's enough).
2. Write a custom org.apache.hadoop.mapred.MapOutputCollector that writes the shuffle output to a Socket instead of to the standard shuffle path (this is the mapper side).
3. Write a custom org.apache.hadoop.mapred.ShuffleConsumerPlugin that waits for connections from the mappers and reads the pairs from the network (this is the reducer side).
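Wiring these pieces together is done through job configuration. A hypothetical mapred-site.xml fragment might look like the following; the property names come from the pluggable-shuffle documentation linked below, but the com.example class names are placeholders for the custom classes you would write:

```xml
<!-- Sketch only: com.example.* classes are hypothetical placeholders. -->
<property>
  <name>mapreduce.job.reduce.slowstart.completedmaps</name>
  <value>0.0</value>
</property>
<property>
  <name>mapreduce.job.map.output.collector.class</name>
  <value>com.example.SocketMapOutputCollector</value>
</property>
<property>
  <name>mapreduce.job.reduce.shuffle.consumer.plugin.class</name>
  <value>com.example.SocketShuffleConsumerPlugin</value>
</property>
```

The same properties can also be set per job on the `Configuration` object in the driver instead of in mapred-site.xml.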
Further reading: https://hadoop.apache.org/docs/r2.6.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html
Def. doable, but requires some effort :)
Upvotes: 3