Reputation: 1840
Hadoop: The Definitive Guide (Tom White), page 178, section "Shuffle and Sort: The Map Side", just after Figure 6-4:
Before it writes to disk, the thread first divides the data into partitions corresponding to the reducers that they will ultimately be sent to. Within each partition, the background thread performs an in-memory sort by key, and if there is a combiner function, it is run on the output of the sort.
Question:
Does this mean the map task writes the output for each key to a different file and combines them later? That is, if two different keys were destined for the same reducer, would each key be sent to the reducer separately instead of in a single file?
If my reasoning above is incorrect, what actually happens?
Upvotes: 4
Views: 2883
Reputation: 1544
Only if the two keys are going to different reducers. If the partitioner decides they should go to the same reducer, they will be in the same file.
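A rough way to picture the map-side step quoted in the question is the simplified Python sketch below. It is not Hadoop's actual code: `spill` and `toy_partition` are made-up names, and the key-length partitioner is only there to make the example deterministic. It shows records being split by reducer, sorted by key within each partition, and optionally combined.

```python
from collections import defaultdict

def spill(records, num_reducers, partition, combiner=None):
    """Partition buffered map output, sort each partition by key, then combine."""
    partitions = defaultdict(list)
    for key, value in records:
        # The partitioner, not the key itself, decides where a record goes.
        partitions[partition(key, num_reducers)].append((key, value))

    spilled = {}
    for p in sorted(partitions):
        # In-memory sort by key within the partition.
        rows = sorted(partitions[p], key=lambda kv: kv[0])
        if combiner is not None:
            # Run the combiner over adjacent records that share a key.
            combined = []
            for key, value in rows:
                if combined and combined[-1][0] == key:
                    combined[-1] = (key, combiner(combined[-1][1], value))
                else:
                    combined.append((key, value))
            rows = combined
        spilled[p] = rows
    return spilled

def toy_partition(key, num_reducers):
    # Hypothetical partitioner for illustration: key length modulo reducer count.
    return len(key) % num_reducers

records = [("pear", 1), ("fig", 1), ("kiwi", 1), ("pear", 1), ("fig", 1)]
result = spill(records, 2, toy_partition, combiner=lambda a, b: a + b)
print(result)
```

Here "pear" and "kiwi" are different keys, but the toy partitioner sends both to partition 0, so they stay together and travel to the same reducer.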
-- Updated to include more details - Mostly from the book:
The partitioner just assigns the keys to buckets, numbered 0 to n-1, where n is the number of reducers in your job. The reduce task has a small number of copier threads so that it can fetch map outputs in parallel. As map tasks complete, they report back to the jobtracker (through their tasktracker), so for a given job the jobtracker knows the mapping between map outputs and hosts. A thread in the reducer periodically asks the master for map output hosts until it has retrieved them all.
The map outputs are copied to the reduce task JVM’s memory if they are small enough (the buffer’s size is controlled by mapred.job.shuffle.input.buffer.percent, which specifies the proportion of the heap to use for this purpose); otherwise, they are copied to disk. When the in-memory buffer reaches a threshold size (controlled by mapred.job.shuffle.merge.percent) or reaches a threshold number of map outputs (mapred.inmem.merge.threshold), it is merged and spilled to disk. If a combiner is specified, it will be run during the merge to reduce the amount of data written to disk.
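To make the thresholds above concrete, here is a toy Python model of the reduce-side copy buffer. Everything in it is an assumption for illustration: the `ShuffleBuffer` class is made up, the numbers are not Hadoop's defaults, "small enough" is simplified to "fits in the remaining buffer space", and the combiner is not modelled. The three constructor arguments after the heap size stand in for mapred.job.shuffle.input.buffer.percent, mapred.job.shuffle.merge.percent and mapred.inmem.merge.threshold.

```python
class ShuffleBuffer:
    """Toy model of the reduce task's in-memory shuffle buffer (not Hadoop code)."""

    def __init__(self, heap_bytes, input_buffer_percent, merge_percent,
                 inmem_merge_threshold):
        self.capacity = int(heap_bytes * input_buffer_percent)
        self.merge_at = int(self.capacity * merge_percent)
        self.inmem_merge_threshold = inmem_merge_threshold
        self.in_memory = []   # sizes of map outputs currently held in memory
        self.used = 0
        self.on_disk = []     # sizes of files written to disk

    def copy(self, size):
        """Copy one map output of the given size; report where it ended up."""
        if self.used + size > self.capacity:
            # Simplifying assumption: anything that does not fit goes to disk.
            self.on_disk.append(size)
            return "disk"
        self.in_memory.append(size)
        self.used += size
        too_full = self.used >= self.merge_at
        too_many = len(self.in_memory) >= self.inmem_merge_threshold
        if too_full or too_many:
            # Merge everything in memory into one file and spill it.
            self.on_disk.append(sum(self.in_memory))
            self.in_memory = []
            self.used = 0
            return "memory, then merged to disk"
        return "memory"

# Illustrative numbers only: 500-byte buffer, merge at 400 bytes or 3 outputs.
buf = ShuffleBuffer(heap_bytes=1000, input_buffer_percent=0.5,
                    merge_percent=0.8, inmem_merge_threshold=3)
outcomes = [buf.copy(size) for size in (100, 600, 100, 100)]
print(outcomes, buf.on_disk)
```

In this run the 600-byte output skips memory entirely, and the third in-memory output trips the count threshold, so the three small outputs are merged into one 300-byte file.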
As the copies accumulate on disk, a background thread merges them into larger, sorted files. This saves some time merging later on. Note that any map outputs that were compressed (by the map task) have to be decompressed in memory in order to perform a merge on them.
When all the map outputs have been copied, the reduce task moves into the sort phase (which should properly be called the merge phase, as the sorting was carried out on the map side), which merges the map outputs, maintaining their sort ordering. This is done in rounds. For example, if there were 50 map outputs and the merge factor was 10 (the default, controlled by the io.sort.factor property, just like in the map’s merge), there would be five rounds. Each round would merge 10 files into one, so at the end there would be five intermediate files.
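The arithmetic in that example can be checked with a small Python sketch. It assumes the simple even split the passage describes (the real scheduling of rounds is more subtle), and `merge_in_rounds` is a made-up helper name, with in-memory lists standing in for files.

```python
import heapq

def merge_in_rounds(runs, merge_factor):
    """Merge sorted runs in groups of merge_factor; one group is one round."""
    intermediate = []
    for start in range(0, len(runs), merge_factor):
        group = runs[start:start + merge_factor]
        # heapq.merge keeps the sort order while combining already-sorted runs.
        intermediate.append(list(heapq.merge(*group)))
    return intermediate

# 50 sorted "map outputs", each holding three keys.
map_outputs = [[i, i + 50, i + 100] for i in range(50)]
intermediate = merge_in_rounds(map_outputs, merge_factor=10)
print(len(intermediate))  # number of intermediate files left after the rounds
```

With 50 inputs and a merge factor of 10, this leaves five sorted intermediate runs, matching the five files in the book's example.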
Rather than have a final round that merges these five files into a single sorted file, the merge saves a trip to disk by directly feeding the reduce function in what is the last phase: the reduce phase. This final merge can come from a mixture of in-memory and on-disk segments.
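The idea of feeding the reduce function straight from the last merge can be sketched in Python as below. This is only an illustration, not Hadoop's implementation: `reduce_phase` is a made-up name, and plain lists stand in for the in-memory and on-disk segments. The merge is lazy, so no fully merged file is ever built.

```python
import heapq
from itertools import groupby
from operator import itemgetter

def reduce_phase(sorted_segments, reduce_fn):
    """Lazily merge sorted segments and hand each key's values to reduce_fn."""
    # heapq.merge yields records one at a time instead of building a merged file.
    merged = heapq.merge(*sorted_segments, key=itemgetter(0))
    for key, group in groupby(merged, key=itemgetter(0)):
        yield reduce_fn(key, (value for _, value in group))

# A mixture of one in-memory segment and two on-disk segments, all sorted by key.
in_memory_segment = [("apple", 1), ("cherry", 4)]
on_disk_segments = [[("apple", 2), ("banana", 3)],
                    [("banana", 5), ("cherry", 1)]]

totals = list(reduce_phase([in_memory_segment, *on_disk_segments],
                           lambda key, values: (key, sum(values))))
print(totals)
```

Because every segment is already sorted, all values for a key arrive together, which is what lets the reduce function run without a final on-disk merge.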
Upvotes: 4
Reputation: 5063
Say you have 3 reducers running. You can then use a partitioner to decide which keys go to which of the three reducers. For example, with integer keys X, the partitioner could return X % 3 to pick the reducer. By default, Hadoop uses HashPartitioner.
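For reference, HashPartitioner computes (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. The Python sketch below mimics that formula, using a re-implementation of Java's String.hashCode as a stand-in for the key's hash. That stand-in is an assumption for illustration: Hadoop's Text type has its own hashCode, so real partition numbers will differ, and the helper only handles characters in the Basic Multilingual Plane.

```python
def java_string_hash(s):
    """Mimic Java's String.hashCode() as a signed 32-bit integer (BMP chars only)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF   # wrap around like a Java int
    return h - 0x100000000 if h >= 0x80000000 else h

def hash_partition(key, num_reducers):
    """Same formula as HashPartitioner: mask off the sign bit, then take the modulus."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers

for key in ["a", "b", "ab", "a"]:
    print(key, "-> reducer", hash_partition(key, 3))
```

The same key always lands on the same reducer, and masking the sign bit keeps the result in the range 0 to 2 even when the hash is negative.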
Upvotes: 1
Reputation: 6686
If multiple reducers are configured, then during partitioning the keys destined for different reducers are stored in separate files, one per reducer. When the map task completes, the whole file is sent to the reducer, not a single key at a time.
Upvotes: 1