Reputation: 87
I am learning Hadoop.
I am running Hadoop on a single node.
As far as I know, the Reducer runs only after the Mapper has completed (which also makes sense).
But when I ran a MapReduce job on a 200 MB file, the Reducer started before the Mapper had completed. I didn't use a Combiner.
Can anyone explain why?
Upvotes: 0
Views: 146
Reputation: 20969
The reduce phase includes copying the mappers' intermediate output to the reducer and merging it.
Copying and merging intermediate output does not need a barrier (there is no need to wait for all mappers to complete), so that is what the reducer is doing while the mappers are still running. The reduce() function itself is only called once the output of every mapper has been fetched.
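To make the idea concrete, here is a minimal, self-contained Java sketch. It is not Hadoop code and does not use the Hadoop API. It is a toy simulation under my own assumptions: the names `ShuffleOverlapDemo`, `mapSplit`, `run` and `events` are made up for illustration. A small thread pool plays the mappers, and the calling thread plays the reducer, which copies and merges each mapper's output as soon as it is available but only does the actual reducing after every output has been copied.

```java
import java.util.*;
import java.util.concurrent.*;

public class ShuffleOverlapDemo {
    // Event log, only used to inspect the order of steps afterwards.
    static final List<String> events =
            Collections.synchronizedList(new ArrayList<>());

    // Hypothetical stand-in for a mapper: counts the words of one input split.
    static Map<String, Integer> mapSplit(String split) {
        Map<String, Integer> out = new HashMap<>();
        for (String word : split.split("\\s+")) {
            if (!word.isEmpty()) {
                out.merge(word, 1, Integer::sum);
            }
        }
        return out;
    }

    static Map<String, Integer> run(List<String> splits) throws Exception {
        events.clear();
        BlockingQueue<Map<String, Integer>> finished = new LinkedBlockingQueue<>();
        ExecutorService mappers = Executors.newFixedThreadPool(2);

        // "Map phase": each split is processed on the pool.
        for (int i = 0; i < splits.size(); i++) {
            final int id = i;
            mappers.submit(() -> {
                Map<String, Integer> out = mapSplit(splits.get(id));
                events.add("map-done " + id);
                finished.add(out); // output is now available to the reducer
            });
        }

        // "Copy and merge": no barrier here, the reducer takes each output
        // as soon as its mapper has finished, while others may still run.
        Map<String, List<Integer>> merged = new TreeMap<>();
        for (int copied = 0; copied < splits.size(); copied++) {
            Map<String, Integer> out = finished.take();
            events.add("copy");
            out.forEach((key, value) ->
                    merged.computeIfAbsent(key, k -> new ArrayList<>()).add(value));
        }
        mappers.shutdown();

        // Barrier passed: every mapper's output has been copied,
        // so only now is the actual reduce logic applied.
        events.add("reduce");
        Map<String, Integer> result = new TreeMap<>();
        merged.forEach((key, values) ->
                result.put(key, values.stream().mapToInt(Integer::intValue).sum()));
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(List.of("a b a", "b c", "a c c")));
        System.out.println(events);
    }
}
```

In the printed event log, "copy" entries can show up between "map-done" entries (the exact interleaving depends on thread scheduling), but "reduce" is always the last entry. That corresponds to what you saw: the reducer looks like it has started, but it is only copying and merging until all mappers are done.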
Upvotes: 1