Casebash

Reputation: 118962

Does reduce in MapReduce run right away, or wait for map to complete?

Just finished reading the following paper on MapReduce. One question: does reduce wait until all map operations are finished, or can it start once some results are available?

Upvotes: 3

Views: 1050

Answers (2)

saurabh shashank

Reputation: 1353

In a MapReduce job, reducers do not start executing the reduce method until all map tasks have completed. However, reducers start copying intermediate key-value pairs from the mappers as soon as they are available. That's why the JobTracker can show reduce at a few percent while some maps are still running.
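To make the two phases concrete, here is a minimal single-process sketch in Python. It is not Hadoop code: `map_task`, `run_job`, and the `events` log are hypothetical names made up for illustration, and the "copy" is just an in-memory append. It only shows the ordering: map output is fetched as each map finishes, but the reduce step runs after the last map.

```python
from collections import defaultdict

def map_task(line):
    # Emit (word, 1) for every word in one input split.
    return [(word, 1) for word in line.split()]

def run_job(splits):
    events = []                      # records the order of the phases
    shuffled = defaultdict(list)     # reducer-side buffer: key -> values

    for i, split in enumerate(splits):
        pairs = map_task(split)
        events.append(f"map {i} done")
        # Copy phase: the reducer fetches this map's output right away,
        # even though later map tasks have not run yet.
        for key, value in pairs:
            shuffled[key].append(value)
        events.append(f"copied output of map {i}")

    # Barrier: the reduce step only runs once every map task has finished,
    # because only then is the full list of values for each key known.
    events.append("all maps done, reduce starts")
    counts = {key: sum(values) for key, values in sorted(shuffled.items())}
    return counts, events

counts, events = run_job(["a b a", "b c"])
```

In this sketch the output of map 0 is copied before map 1 has even finished, which is the same reason the reduce progress bar moves early. In Hadoop, how early the copy phase starts is controlled by the `mapreduce.job.reduce.slowstart.completedmaps` setting.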

Upvotes: 5

Michael Shaw

Reputation: 225

Haskell has map and reduce (they call it fold) built in, and because evaluation is lazy, the order in which the map and fold steps run is not fixed in advance (you can even operate on infinite lists, as long as you don't try to evaluate the whole thing). So you could do it either way.

If you're asking how Google did it, I don't know for sure, but they probably set it up so the reduce consumes the list they're mapping over to the maximum extent possible, since that way they don't have to keep the already-processed values in memory.
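As a rough illustration of the "reduce consumes the mapped list as it goes" idea, here is a small Python sketch (Python rather than Haskell, but its built-in `map` is also lazy). This says nothing about how Google actually implemented it; `mapper`, `reducer`, and the `stats` counter are hypothetical names used only to count how many mapped values exist at once.

```python
from functools import reduce

# Tracks mapped values that have been produced but not yet reduced.
stats = {"live": 0, "peak": 0}

def mapper(x):
    stats["live"] += 1
    stats["peak"] = max(stats["peak"], stats["live"])
    return x * x

def reducer(acc, value):
    stats["live"] -= 1
    return acc + value

# map() returns a lazy iterator, so reduce() pulls one mapped value,
# folds it into the accumulator, and only then asks for the next one.
total = reduce(reducer, map(mapper, range(1, 6)), 0)
```

Here at most one mapped value is alive at any time, so nothing has to be buffered. That only works for a fold over a single stream; a distributed reducer that needs every value for a key cannot finish until all mappers that might emit that key are done.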

Upvotes: 1
