Reputation: 1702
What is the relation between block size, input splits and the number of map tasks? How are the map tasks invoked?
Upvotes: 0
Views: 58
Reputation: 61
The answers above are incomplete: also consider whether the file used by your MapReduce job is splittable. Gzip-compressed files are not splittable, so irrespective of the block size and the input split size, one mapper is spawned per gz file.
Order of precedence when determining the number of mappers:
if the file is not splittable - 1 mapper per file
if split size >= block size - 1 mapper per block
if split size < block size - roughly (block size / split size) mappers per block, i.e. about (file size / split size) mappers per file.
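The rules above can be sketched as a small stand-alone calculation. This is only an illustration that mirrors the split-size formula used by Hadoop's FileInputFormat as I understand it (split size = max(minSize, min(maxSize, blockSize)), with a 10% slack on the last split); the class and method names (SplitEstimate, estimateSplits) are hypothetical and not part of Hadoop.

```java
// Hypothetical helper, not part of Hadoop: estimates how many input splits
// (and therefore map tasks) a single file would produce.
class SplitEstimate {
    // Assumption: same slack factor as FileInputFormat, so the last split
    // may be up to 10% larger than the split size.
    static final double SPLIT_SLOP = 1.1;

    // Split size is the block size, clamped by the configured min and max.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    static long estimateSplits(long fileLength, boolean splittable,
                               long blockSize, long minSize, long maxSize) {
        // A non-splittable file (e.g. gzip) or an empty file yields one split.
        if (!splittable || fileLength == 0) {
            return 1;
        }
        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        long splits = 0;
        long remaining = fileLength;
        // Carve off full-size splits while more than 1.1 split sizes remain.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        // Whatever is left becomes the final split.
        if (remaining != 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 1 GB file, 128 MB blocks, default limits: prints 8
        System.out.println(estimateSplits(1024 * mb, true, 128 * mb, 1L, Long.MAX_VALUE));
        // Same file with max split size 64 MB: prints 16
        System.out.println(estimateSplits(1024 * mb, true, 128 * mb, 1L, 64 * mb));
        // Same file, but gzip-compressed (not splittable): prints 1
        System.out.println(estimateSplits(1024 * mb, false, 128 * mb, 1L, Long.MAX_VALUE));
    }
}
```

Note that a minimum split size larger than the block size pushes the count the other way: the splits become larger than a block and fewer mappers are started.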
Upvotes: 1
Reputation: 32949
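By default, for splittable files, there will be one InputSplit per block and hence one map task per block. To change this behavior, set the maximum split size of the input format (for example with FileInputFormat.setMaxInputSplitSize, or the mapreduce.input.fileinputformat.split.maxsize property) to a value smaller than the block size.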
To find the number of map tasks produced (after the maps have run), you could use a Counter and increment it in the mapper's setup method, which is called once per map task.
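A minimal sketch of the counter idea, assuming the newer org.apache.hadoop.mapreduce API and Hadoop on the classpath. The class name CountingMapper, the enum TaskCounters and the key/value types are made up for illustration; the map body is just a placeholder.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper: bumps a custom counter once per map task.
public class CountingMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    // Custom counter group; the name is an assumption for this example.
    public enum TaskCounters { MAP_TASKS_STARTED }

    private static final LongWritable ONE = new LongWritable(1);

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // setup() runs once per task, before any call to map().
        context.getCounter(TaskCounters.MAP_TASKS_STARTED).increment(1);
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Placeholder logic: emit each input line with a count of 1.
        context.write(value, ONE);
    }
}
```

After job.waitForCompletion(true) returns in the driver, the total should be readable with job.getCounters().findCounter(CountingMapper.TaskCounters.MAP_TASKS_STARTED).getValue().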
Upvotes: 0