user3606212

Reputation: 83

How many JVMs are launched per node in a Hadoop cluster if mapred.job.reuse.jvm.num.tasks is set to -1?

I recently came across the mapred.job.reuse.jvm.num.tasks property in Hadoop. By default it is set to 1, which means a new JVM is launched for each map/reduce task. If it is set to -1, a JVM can be reused by an unlimited number of tasks. In that case, tasks execute serially, one after another, in order to share the same JVM.

So when the property is set to 1, the number of JVMs launched per node equals the number of tasks; no confusion there. My specific question is: how many JVMs are launched per node if I set mapred.job.reuse.jvm.num.tasks to -1? Is it only ONE JVM per node, or something else?

Upvotes: 2

Views: 3154

Answers (1)

dpsdce

Reputation: 5450

The mapred.job.reuse.jvm.num.tasks property sets the maximum number of tasks of a single job that will be executed in a single JVM. The default value is 1.

Tasks from different jobs always run in separate JVMs; reuse applies only to tasks of the same job.

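So when you set this to -1, there is no limit on how many tasks of a job can run, one after another, in the same JVM on a node, while a new job spawns new JVMs for its own tasks. Tasks that run at the same time still each get their own JVM, so the number of JVMs per node is bounded by the node's task slots rather than being fixed at one.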

Hadoop typically launches each map or reduce task in a forked JVM. JVM startup can add significant overhead, especially for jobs with hundreds or thousands of tasks, most of which have short execution times. Reuse allows a JVM instance to be reused up to N times for the same job.

  <property>
    <name>mapred.job.reuse.jvm.num.tasks</name>
    <value>10</value>
    <description>How many tasks to run per jvm. If set to -1, there is no limit</description>
  </property>

With this configuration, a JVM instance is reused for up to 10 tasks of one particular job before a new JVM is started.
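To make the effect of the setting concrete, here is a rough sketch (not Hadoop code) that counts JVM launches on one node for one job under a simplified model. The function name count_jvm_launches and its parameters are hypothetical. The model assumes a fixed number of task slots on the node, that all tasks are of the same type and the same job, and that tasks are handed to slots round-robin; real scheduling is more dynamic than this.

```python
def count_jvm_launches(num_tasks, slots, reuse_limit):
    """Count JVM launches on one node for one job (simplified model).

    num_tasks   -- tasks of this job that run on the node (assumption)
    slots       -- concurrent task slots on the node (assumption)
    reuse_limit -- value of mapred.job.reuse.jvm.num.tasks (-1 = no limit)
    """
    if slots < 1:
        raise ValueError("slots must be at least 1")
    if reuse_limit == 0 or reuse_limit < -1:
        raise ValueError("reuse_limit must be -1 or a positive integer")

    launches = 0
    # For each slot: how many tasks its current JVM has run (None = no JVM yet).
    tasks_run = [None] * slots

    for task in range(num_tasks):
        slot = task % slots  # round-robin assignment, a modelling shortcut
        used = tasks_run[slot]
        exhausted = reuse_limit != -1 and used is not None and used >= reuse_limit
        if used is None or exhausted:
            launches += 1  # start a fresh JVM in this slot
            used = 0
        tasks_run[slot] = used + 1

    return launches


if __name__ == "__main__":
    for limit in (1, 10, -1):
        print(limit, count_jvm_launches(100, 2, limit))
```

Under these assumptions, with 100 tasks and 2 slots, a limit of 1 gives one launch per task, a limit of 10 gives one launch per 10 tasks in each slot, and -1 gives one launch per slot in use, so the count depends on the slot configuration rather than being exactly one per node.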

Upvotes: 1
