marcorossi

Reputation: 2011

Hadoop: force 1 mapper task per node from jobconf

I want to run one task (mapper) per node on my Hadoop cluster, but I cannot modify the configuration that the tasktracker runs with (I'm just a user).
For this reason, I need to be able to push the option through the job configuration. I tried setting mapred.tasktracker.map.tasks.maximum=1 on the hadoop jar command line, but the tasktracker ignores it because it has a different setting in its own configuration file.
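For reference, the attempt looked roughly like this (the jar name, driver class, and paths are just placeholders):

    hadoop jar myjob.jar MyDriver -D mapred.tasktracker.map.tasks.maximum=1 /input /output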
By the way, the cluster uses the Capacity Scheduler.

Is there any way I can force 1 task per node?

Edit: Why? I have a memory-bound task, so I want each task to use all of the memory available on the node.

Upvotes: 0

Views: 799

Answers (1)

Tariq

Reputation: 34184

When you set the number of mappers, whether through the configuration files or by some other means, it is only a hint to the framework; it does not guarantee that you'll get exactly the specified number of mappers. The creation of mappers is actually governed by the number of input splits, and split creation is carried out by the logic in your InputFormat. If you really want a single mapper to process an entire file, override isSplitable() to return false in the InputFormat class you are using. But why would you do that? The power of Hadoop lies in distributed parallel processing.
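A minimal sketch of that approach, assuming the newer org.apache.hadoop.mapreduce API (the class name WholeFileTextInputFormat is made up for illustration):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    // Marks every input file as non-splittable, so each file goes to exactly one mapper.
    public class WholeFileTextInputFormat extends TextInputFormat {

        @Override
        protected boolean isSplitable(JobContext context, Path file) {
            // Returning false tells FileInputFormat not to split this file,
            // so a single map task processes the whole file.
            return false;
        }
    }

The driver would then pick it up with job.setInputFormatClass(WholeFileTextInputFormat.class). Keep in mind this gives you one mapper per input file, which is not quite the same thing as one mapper per node.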

Upvotes: 1
