Reputation: 31262
Here is the source code for Mapper's run() method:
public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKeyValue()) {
        map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    cleanup(context);
}
As you can see, context is used both for reading and for writing. How is that possible? That is, context.getCurrentKey() and context.getCurrentValue() are used to retrieve the key and value pair from the context, and that pair is then passed to the map function. Is the same context used for both input and output?
Upvotes: 0
Views: 3644
Reputation: 860
Yes, the same context is used for both input and output. It stores references to a RecordReader and a RecordWriter. Whenever context.getCurrentKey() and context.getCurrentValue() are called to retrieve the current key/value pair, the request is delegated to the RecordReader. And when context.write() is called, the call is delegated to the RecordWriter.
Note that RecordReader and RecordWriter are actually abstract classes.
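You can see both directions in action in a minimal mapper along the lines of the classic Hadoop word-count example: run() fetches each key/value pair through the context, and the map() body emits output through the very same context.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // run() fetched this key/value pair through the context's RecordReader;
        // the very same context now carries the output to the RecordWriter.
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
        }
    }
}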
Update: org.apache.hadoop.mapreduce.Mapper$Context implements org.apache.hadoop.mapreduce.MapContext, which in turn subclasses org.apache.hadoop.mapreduce.TaskInputOutputContext. Look at the source of org.apache.hadoop.mapreduce.task.MapContextImpl, which is in turn a subclass of org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl, to see where exactly Context delegates input and output to the RecordReader and RecordWriter.
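As a condensed sketch of that split (simplified stand-in types and shortened names, not the verbatim Hadoop source): the base context owns the RecordWriter and the write() path, while the map-specific subclass owns the RecordReader and the read path.

import java.io.IOException;

// Simplified stand-ins for the two abstract classes (the real ones live
// in org.apache.hadoop.mapreduce).
interface ReaderSketch<K, V> {
    boolean nextKeyValue() throws IOException;
    K getCurrentKey() throws IOException;
    V getCurrentValue() throws IOException;
}

interface WriterSketch<K, V> {
    void write(K key, V value) throws IOException;
}

// Output half: the base context owns the writer, so write() is available
// to every subclass (mirrors TaskInputOutputContextImpl).
abstract class TaskIOContextSketch<KIN, VIN, KOUT, VOUT> {
    private final WriterSketch<KOUT, VOUT> output;

    TaskIOContextSketch(WriterSketch<KOUT, VOUT> output) {
        this.output = output;
    }

    public void write(KOUT key, VOUT value) throws IOException {
        output.write(key, value); // delegated to the RecordWriter
    }
}

// Input half: the map-specific subclass owns the reader and forwards the
// three read methods to it (mirrors MapContextImpl).
class MapContextSketch<KIN, VIN, KOUT, VOUT>
        extends TaskIOContextSketch<KIN, VIN, KOUT, VOUT> {
    private final ReaderSketch<KIN, VIN> reader;

    MapContextSketch(ReaderSketch<KIN, VIN> reader, WriterSketch<KOUT, VOUT> output) {
        super(output);
        this.reader = reader;
    }

    public boolean nextKeyValue() throws IOException { return reader.nextKeyValue(); }
    public KIN getCurrentKey() throws IOException { return reader.getCurrentKey(); }
    public VIN getCurrentValue() throws IOException { return reader.getCurrentValue(); }
}

This split is why a single Context reference can serve both directions: the read path and the write path live at different levels of the same object.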
Upvotes: 4