Reputation: 53
Is it possible to set the number of map tasks running per node?
I'm using Hadoop Streaming for crawling data, and I need only one map task per node to avoid blocks.
Thanks,
Upvotes: 2
Views: 1983
Reputation: 11
Have you tried playing with the following settings in your job.xml?
mapred.max.maps.per.node=1
mapred.max.reduces.per.node=1
Both default to -1, meaning unlimited (apart from, of course, the number of available slots).
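In case it helps, here is a rough sketch of how those two settings would look as property entries in the job configuration. This assumes your Hadoop build actually recognizes these property names, which I have not verified for every version, so treat it as something to try rather than a guaranteed fix.

```xml
<!-- Sketch only: assumes this Hadoop version honors these per-job limits -->
<property>
  <name>mapred.max.maps.per.node</name>
  <value>1</value> <!-- at most one map task of this job per node -->
</property>
<property>
  <name>mapred.max.reduces.per.node</name>
  <value>1</value> <!-- at most one reduce task of this job per node -->
</property>
```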
Upvotes: 1
Reputation: 33495
Whether or not you use Streaming, the maximum number of mappers per node can be set with the mapreduce.tasktracker.map.tasks.maximum parameter.
The parameter has to be set in the mapred-site.xml file on the node itself; it has no effect when set on the client.
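As a minimal sketch, the entry in mapred-site.xml on each worker node would look something like the fragment below. The value 1 is chosen to match the one-map-task-per-node requirement in the question. I would expect the TaskTracker to need a restart before the change is picked up, since it reads this file at startup.

```xml
<!-- mapred-site.xml on each worker node (not on the client) -->
<configuration>
  <property>
    <name>mapreduce.tasktracker.map.tasks.maximum</name>
    <value>1</value> <!-- one map slot per node -->
  </property>
</configuration>
```

Note that this caps map slots for every job running on that node, not just the crawling job. On older Hadoop 1.x releases the same setting is named mapred.tasktracker.map.tasks.maximum.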
Upvotes: 3