Reputation: 118962
Just finished reading the following paper on MapReduce. One question: does reduce wait until all map operations are finished, or can it start once some results are available?
Upvotes: 3
Views: 1050
Reputation: 1353
In a MapReduce job, reducers do not start executing the reduce method until all map tasks have completed. Reducers do start copying intermediate key-value pairs from the mappers as soon as they are available. That's why the JobTracker can show reduce at a few percent while some maps are still running.
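The reason reduce has to wait is that a reducer needs every value for a key, and any map task still running could emit one more. Here is a minimal single-process sketch of that ordering, assuming a word-count job; `map_words` and `run_job` are made-up names for illustration, not Hadoop APIs, and the real framework does the copying over the network in parallel.

```python
from collections import defaultdict


def map_words(split):
    """Map step: emit (word, 1) for each word in one input split."""
    return [(word, 1) for word in split.split()]


def run_job(splits):
    shuffled = defaultdict(list)  # reducer-side buffer of copied pairs
    log = []
    for i, split in enumerate(splits):
        pairs = map_words(split)      # one map task finishes
        for key, value in pairs:      # its output is copied right away
            shuffled[key].append(value)
        log.append(f"map {i} done, copied {len(pairs)} pairs")
    # Only now, with every map task finished, is reduce applied per key.
    log.append("all maps done, reduce starts")
    counts = {key: sum(values) for key, values in sorted(shuffled.items())}
    return counts, log


counts, log = run_job(["a b a", "b c"])
```

The copy lines appear in `log` before the reduce line, which mirrors the small reduce percentage shown while maps are still running.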
Upvotes: 5
Reputation: 225
Haskell has map and reduce (it calls reduce a fold) built in, and because evaluation is lazy, the order in which elements are mapped and folded is not fixed in advance (you can even operate on infinite lists, as long as you don't try to evaluate the whole thing). So you could do it either way.
If you're asking how Google did it, I don't know for sure, but they probably set it up so the reduce consumes the list they're mapping over to the maximum extent possible, since that way they don't have to keep the already-processed values in memory.
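The "reduce consumes the list as it is mapped" idea can be sketched on a single machine with generators. This uses Python rather than Haskell, and `squares` is a made-up helper; it only illustrates a lazy map feeding a fold, not how any distributed MapReduce implementation schedules its work.

```python
from functools import reduce
from itertools import count, islice


def squares(numbers):
    """Lazy map: yields one squared value at a time."""
    for n in numbers:
        yield n * n


# count(1) is an infinite sequence; islice limits how much is evaluated.
first_five = islice(squares(count(1)), 5)

# The fold pulls mapped values one by one, so no mapped list is stored.
total = reduce(lambda acc, x: acc + x, first_five, 0)
```

Only the running accumulator is kept in memory, which is the saving the answer describes.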
Upvotes: 1