Reputation: 2077
Suppose I have two datasets:
hello world
bye world
and
hello earth
new earth
and I want to run a map-reduce job that does not specify a mapper class or a reducer class, so the default mapper and reducer are used, both of which are identity functions. When I run the job, the output is:
0 hello world
0 hello earth
12 new earth
12 bye world
I am confused about why the keys are 0 and 12. I just used the default mapper and reducer by commenting out these lines in main():
// job.setMapperClass(Map.class);
// job.setCombinerClass(Reduce.class);
// job.setReducerClass(Reduce.class);
So my question is: what is the output key here? Why does it look like 0, 0, 12, 12?
Upvotes: 0
Views: 1980
Reputation: 33545
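0, 0, 12 and 12 are byte offsets into the input files. With text input (the default TextInputFormat), the key passed to the mapper is the byte offset at which the line starts in its file, and the value is the line itself. The first line of each file starts at offset 0; it is 11 characters plus a newline, so the second line starts at offset 12. Because the identity mapper and reducer pass keys through unchanged, those offsets show up in the output. Check this for more information.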
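To see where 0 and 12 come from without running a job, here is a minimal plain-Java sketch that computes the byte offset at which each line starts. It does not use Hadoop; the class name `LineOffsets` and the method `offsets` are made up for illustration, and it assumes UTF-8 text with `\n` line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LineOffsets {
    // Returns the byte offset at which each line starts.
    // Assumption: UTF-8 encoding and '\n' line endings.
    static List<Long> offsets(String fileContents) {
        byte[] bytes = fileContents.getBytes(StandardCharsets.UTF_8);
        List<Long> result = new ArrayList<>();
        long lineStart = 0;
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == '\n') {
                result.add(lineStart);   // record where this line began
                lineStart = i + 1;       // next line begins after the newline
            }
        }
        // Handle a final line that has no trailing newline.
        if (lineStart < bytes.length) {
            result.add(lineStart);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(offsets("hello world\nbye world\n")); // [0, 12]
        System.out.println(offsets("hello earth\nnew earth\n")); // [0, 12]
    }
}
```

Both files produce the offsets 0 and 12, which matches the keys in the job output above.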
Upvotes: 2