justin waugh

Reputation: 895

Can a mapper know how many mappers are running?

In the Hadoop MapReduce framework, is it possible for a running application to find out how many workers (map tasks) it has? Since the number of workers is the same as the number of file splits, the question is equivalently: can the number of file splits be determined dynamically?

Upvotes: 0

Views: 205

Answers (1)

Chris White

Reputation: 30089

The total number of map tasks and reduce tasks that make up the job can be queried via the mapred.map.tasks and mapred.reduce.tasks configuration properties (once your job has been submitted).

If you look through the source, you can see this being set in org.apache.hadoop.mapred.JobClient:784 (and yes, it's the same as the number of splits):

// Create the splits for the job
LOG.debug("Creating splits at " + fs.makeQualified(submitSplitFile));
int maps;
if (job.getUseNewMapper()) {
  maps = writeNewSplits(context, submitSplitFile);
} else {
  maps = writeOldSplits(job, submitSplitFile);
}
job.set("mapred.job.split.file", submitSplitFile.toString());
job.setNumMapTasks(maps); // here is where mapred.map.tasks is set
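
To read those properties from inside a task, something like the following could work. This is only a sketch: the class name TaskCounts and the helper readCount are made up for illustration, and the lookup is passed in as a plain function so the snippet runs without Hadoop on the classpath. In a real mapper you would presumably pass the job configuration's string getter (for example context.getConfiguration()::get in the new API) instead of the HashMap used here. Newer Hadoop releases also expose the same values under the names mapreduce.job.maps and mapreduce.job.reduces, so check which names your version uses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Hadoop.
class TaskCounts {
    // Property names quoted in the answer above.
    static final String MAP_TASKS = "mapred.map.tasks";
    static final String REDUCE_TASKS = "mapred.reduce.tasks";

    // Reads an integer property through any String -> String lookup.
    // Returns fallback when the property is missing or is not a number.
    static int readCount(Function<String, String> lookup, String key, int fallback) {
        String raw = lookup.apply(key);
        if (raw == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the job configuration; the value 12 is invented for the demo.
        Map<String, String> conf = new HashMap<>();
        conf.put(MAP_TASKS, "12");

        int maps = readCount(conf::get, MAP_TASKS, -1);
        int reducers = readCount(conf::get, REDUCE_TASKS, 1);
        System.out.println("maps=" + maps + " reducers=" + reducers);
    }
}
```

The fallback argument matters because the map count is only filled in at submission time, as the JobClient excerpt shows, so code that runs before submission would not see the real number of splits.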

Upvotes: 1
