Krishna Kalyan

Reputation: 1702

How to find number of MapTasks spawned?

What is the relation between block size, input splits, and the number of MapTasks? How are the map tasks invoked?

Upvotes: 0

Views: 58

Answers (2)

Thejas

Reputation: 61

The other answer is incomplete. Also consider whether the file your MapReduce job reads is splittable at all. Gzip-compressed files are not splittable, so one mapper is spawned per .gz file regardless of the block size and the input split size.

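Precedence for choosing the number of mappers:

If the file is not splittable: 1 mapper per file.

If the split size equals the block size (the default): 1 mapper per block. If the minimum split size is raised above the block size, splits span several blocks and there are fewer mappers.

If the split size < block size: about (file size / split size) mappers per file, i.e. (block size / split size) mappers per block.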
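
To make the precedence above concrete, here is a minimal sketch in plain Java that estimates the number of splits (and so map tasks) for a single non-empty file. It is modeled on the split-size rule used by Hadoop's FileInputFormat, max(minSize, min(maxSize, blockSize)), including its 10% slack on the last split. The class and method names (SplitEstimate, splitSize, countSplits) are made up for illustration and are not part of Hadoop. Note that in this sketch a split size larger than the block size gives fewer splits than blocks.

```java
public class SplitEstimate {
    // FileInputFormat lets the last split be up to 10% larger than the split size.
    private static final double SPLIT_SLOP = 1.1;

    // Split size rule: the block size, clamped between the min and max split sizes.
    public static long splitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Estimated number of splits for one file (assumption: fileLength > 0).
    public static long countSplits(long fileLength, boolean splittable, long splitSize) {
        if (fileLength <= 0 || splitSize <= 0) {
            throw new IllegalArgumentException("fileLength and splitSize must be positive");
        }
        if (!splittable) {
            return 1; // e.g. a gzip file: one mapper for the whole file
        }
        long splits = 0;
        long remaining = fileLength;
        // Carve off full-size splits while more than 1.1 split sizes remain.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // the tail becomes the final split
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long file = 1024 * mb;   // hypothetical 1 GB input file
        long block = 128 * mb;   // hypothetical 128 MB block size

        long defaults = splitSize(1, Long.MAX_VALUE, block);
        long smaller = splitSize(1, 64 * mb, block);             // max split size below block size
        long larger = splitSize(256 * mb, Long.MAX_VALUE, block); // min split size above block size

        System.out.println("default:        " + countSplits(file, true, defaults));
        System.out.println("max = 64 MB:    " + countSplits(file, true, smaller));
        System.out.println("min = 256 MB:   " + countSplits(file, true, larger));
        System.out.println("not splittable: " + countSplits(file, false, defaults));
    }
}
```

With these example sizes the default case yields one split per block, halving the maximum split size doubles the split count, and a non-splittable file always yields a single split.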

Upvotes: 1

John B

Reputation: 32949

By default there is one InputSplit per block, and hence one map task per block. To change this behavior, lower the input format's maximum split size so that it is less than the block size.

To find the number of map tasks that were produced (after the maps have run), you could use a Counter and increment it in the mapper's setup method, which is called once per task.

Upvotes: 0
