Reputation: 895
In the Hadoop MapReduce framework, while an application is running, is it possible to know the number of workers running in the program? The number of workers is the same as the number of file splits, so in other words, is it possible to determine the number of file splits dynamically?
Upvotes: 0
Views: 205
Reputation: 30089
The total number of map tasks and reduce tasks that make up the job can be queried via the mapred.map.tasks
and mapred.reduce.tasks
configuration properties (once your job has been submitted).
If you look through the source, you can see this being set in org.apache.hadoop.mapred.JobClient:784
(and yes, it is the same as the number of splits):
// Create the splits for the job
LOG.debug("Creating splits at " + fs.makeQualified(submitSplitFile));
int maps;
if (job.getUseNewMapper()) {
  maps = writeNewSplits(context, submitSplitFile);
} else {
  maps = writeOldSplits(job, submitSplitFile);
}
job.set("mapred.job.split.file", submitSplitFile.toString());
job.setNumMapTasks(maps); // here is where mapred.map.tasks is set
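As a sketch of how a task could read these values at runtime (assuming the old org.apache.hadoop.mapred API; the class name here is illustrative, not from the question), the JobConf accessors backed by those two properties can be used in configure():

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;

// Hypothetical mapper base showing where the counts are available.
public class SplitCountAwareMapper extends MapReduceBase {
  private int numMapTasks;
  private int numReduceTasks;

  @Override
  public void configure(JobConf conf) {
    // getNumMapTasks() is backed by mapred.map.tasks, which JobClient
    // sets to the number of input splits at submission time.
    numMapTasks = conf.getNumMapTasks();
    // getNumReduceTasks() is backed by mapred.reduce.tasks.
    numReduceTasks = conf.getNumReduceTasks();
  }
}
```

Equivalently, conf.get("mapred.map.tasks") returns the same value as a string once the job has been submitted.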
Upvotes: 1