daydreamer

Reputation: 92179

Hadoop - increasing map tasks in XML doesn't increase map tasks at runtime

I added the following in my conf/mapred-site.xml

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>

<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>1</value>
</property>

But when I run the job, it still runs 2 map tasks (the default). How can I force this number to increase?

P.S. I am using an Ubuntu quad-core box.

Thank you

Upvotes: 5

Views: 4399

Answers (3)

saiyan

Reputation: 611

mapred.tasktracker.map.tasks.maximum is the maximum number of map tasks a single tasktracker can run simultaneously. If you want to set the number of map tasks for a job as a whole, set mapred.map.tasks to 4 instead.
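As a sketch, that per-job setting would go in the job's configuration rather than the tasktracker's (note that mapred.map.tasks is only a hint to the framework; the actual number of map tasks is ultimately decided by the InputFormat's split count):

```xml
<!-- Per-job setting (a hint, not a hard limit): suggest 4 map tasks -->
<property>
  <name>mapred.map.tasks</name>
  <value>4</value>
</property>
```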

Upvotes: 2

QuinnG

Reputation: 6404

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
  <final>true</final>
</property>

Try that.

Upvotes: 0

Donald Miner

Reputation: 39943

Are you running over a small amount of data? It could be that your MapReduce job is running over only one input split and thus does not require more mappers. Try running your job over hundreds of MB of data instead and see if you still have the same issue.

The maximum number of tasks that can run on a single node has nothing to do with the number of map tasks a job has. Your job could have 20 map tasks while your cluster has only 5 map slots; it will just take longer. Or your cluster could have 50 map slots, but your job might have only 2 map tasks.
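The split arithmetic behind this can be sketched roughly (an illustration only, assuming one map task per input split and the Hadoop 1.x default 64 MB block size; `estimated_map_tasks` is a hypothetical helper, not a Hadoop API):

```python
# Rough sketch: why a small input yields few map tasks.
# Assumption: one map task per input split, split size = HDFS block size
# (64 MB by default in Hadoop 1.x).
BLOCK_SIZE = 64 * 1024 * 1024  # default dfs.block.size

def estimated_map_tasks(input_bytes, split_bytes=BLOCK_SIZE):
    # Each full or partial split gets its own map task; at least one map runs.
    return max(1, -(-input_bytes // split_bytes))  # ceiling division

print(estimated_map_tasks(10 * 1024 * 1024))   # 10 MB input  -> 1 map task
print(estimated_map_tasks(300 * 1024 * 1024))  # 300 MB input -> 5 map tasks
```

So with a tiny input file, raising any maximum will not help; only more input data (or a smaller split size) produces more map tasks.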

Upvotes: 4
