avinash

Reputation: 147

In MapReduce, how does reduce task differ from reducer

In MapReduce, how does a reduce task differ from a Reducer?

What is the relationship between a reduce task and a Reducer?

Does the Reducer perform the reduce task?

Many Thanks

Upvotes: 1

Views: 151

Answers (2)

Ravindra babu

Reputation: 38910

From the Apache documentation:

Reducer reduces a set of intermediate values which share a key to a smaller set of values.

Reducer has 3 primary phases:

Shuffle

Reducer is input the grouped output of a Mapper. In the phase the framework, for each Reducer, fetches the relevant partition of the output of all the Mappers, via HTTP.

Sort

The framework groups Reducer inputs by keys (since different Mappers may have output the same key) in this stage.

Reduce

In this phase the reduce(Object, Iterator, OutputCollector, Reporter) method is called for each <key, (list of values)> pair in the grouped inputs.

The output of the Reduce task is typically written to the FileSystem via OutputCollector.collect(Object, Object).
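To make the three phases concrete, here is a minimal plain-Java sketch that imitates them in memory. It does not use Hadoop at all: the class name ReducePhasesSketch, the method names, and the choice of summing values in reduce are assumptions made for illustration, and the real framework fetches map output over the network rather than from a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: an in-memory imitation of shuffle, sort and reduce.
public class ReducePhasesSketch {

    // Shuffle: gather the pairs meant for this reducer from every mapper's output.
    public static List<Map.Entry<String, Integer>> shuffle(
            List<List<Map.Entry<String, Integer>>> mapperOutputs) {
        List<Map.Entry<String, Integer>> fetched = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> output : mapperOutputs) {
            fetched.addAll(output);
        }
        return fetched;
    }

    // Sort: group values by key; the TreeMap keeps the keys in sorted order.
    public static Map<String, List<Integer>> sort(List<Map.Entry<String, Integer>> fetched) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : fetched) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce: invoked once per <key, (list of values)> pair; here it sums the values.
    public static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs the three phases in order and collects one output value per key.
    public static Map<String, Integer> run(List<List<Map.Entry<String, Integer>>> mapperOutputs) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : sort(shuffle(mapperOutputs)).entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Integer>>> mapperOutputs = List.of(
            List.of(Map.entry("apple", 1), Map.entry("pear", 1)),
            List.of(Map.entry("apple", 1)));
        System.out.println(run(mapperOutputs)); // prints {apple=2, pear=1}
    }
}
```

The same key ("apple") arrives from two different mappers, and only after grouping does reduce see both values together.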

Note that, apart from the Reducer, a Combiner also invokes the reduce function, since a Combiner is itself an implementation of Reducer.

Reducer is a class that contains the reduce function, as shown below:

    protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context)
            throws IOException, InterruptedException {
        // called once per key, with all values that share that key
    }

A reduce task is a program running on a node that executes the reduce function of the Reducer class.
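The class-versus-task distinction can be sketched without Hadoop as follows. Everything here is hypothetical and simplified: SumReducer stands in for a user-written Reducer class, ReduceTask stands in for the running unit that owns one partition, and routing keys by hash modulo the task count is an assumed partitioning scheme for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: one Reducer class, several reduce tasks that each run it.
public class ReduceTaskSketch {

    // Stand-in for a Reducer class: it holds nothing but the user's reduce logic.
    public static class SumReducer {
        public int reduce(String key, List<Integer> values) {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            return sum;
        }
    }

    // Stand-in for a reduce task: owns one partition and drives the Reducer over it.
    public static class ReduceTask {
        private final Map<String, List<Integer>> partition = new TreeMap<>();
        public int reduceCalls = 0;

        public void accept(String key, int value) {
            partition.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }

        public Map<String, Integer> run() {
            SumReducer reducer = new SumReducer(); // each task creates its own instance
            Map<String, Integer> output = new TreeMap<>();
            for (Map.Entry<String, List<Integer>> e : partition.entrySet()) {
                output.put(e.getKey(), reducer.reduce(e.getKey(), e.getValue()));
                reduceCalls++;
            }
            return output;
        }
    }

    // Creates numTasks tasks and routes every pair to one of them by key hash.
    public static List<ReduceTask> assign(List<Map.Entry<String, Integer>> pairs, int numTasks) {
        List<ReduceTask> tasks = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            tasks.add(new ReduceTask());
        }
        for (Map.Entry<String, Integer> pair : pairs) {
            int index = (pair.getKey().hashCode() & Integer.MAX_VALUE) % numTasks;
            tasks.get(index).accept(pair.getKey(), pair.getValue());
        }
        return tasks;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
            Map.entry("apple", 1), Map.entry("pear", 1), Map.entry("apple", 1));
        List<ReduceTask> tasks = assign(pairs, 2);
        for (int i = 0; i < tasks.size(); i++) {
            System.out.println("task " + i + " -> " + tasks.get(i).run());
        }
    }
}
```

There is one SumReducer class but as many running tasks as requested, and every value for a given key lands in the same task, so reduce is called exactly once per key across the whole job.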

Upvotes: 1

madhu

Reputation: 1170

A reduce task is simply a running instance of the Reducer.

The number of reduce tasks is configurable.

It can be set either through the mapred.reduce.tasks property in the job configuration object (named mapreduce.job.reduces in Hadoop 2 and later), or by calling the org.apache.hadoop.mapreduce.Job#setNumReduceTasks(int) method.

Upvotes: 1
