Arun K

Reputation: 121

Understanding Hadoop Simulator Mumak

Recently I was trying to understand how Mumak works (see, e.g., MAPREDUCE-728).

It basically takes a job trace and a topology trace and simulates Hadoop. I couldn't understand how it assigns splits across nodes. What does Mumak mean by a local map task and a non-local task?

Upvotes: 1

Views: 1632

Answers (1)

Tader

Reputation: 26902

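In MapReduce there is the notion of "locality", which describes how far a task runs from the data it works on. The best locality is node-local: the task runs on a node that contains the data it needs. The second best is rack-local: the task runs on a node in the same rack as a node containing the data. Anything farther away is rack-remote. A local map task is therefore one that runs on a node holding its input data, and a non-local task is one that runs anywhere else.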
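To make the three levels concrete, here is a minimal sketch of how a task's locality could be classified. It is not Mumak's code: the function `classify_locality`, the `Locality` enum, and the `rack_of` mapping (host name to rack path) are hypothetical names chosen for illustration, and the host and rack names are made up.

```python
from enum import Enum


class Locality(Enum):
    NODE_LOCAL = 0   # task runs on a node that holds the data
    RACK_LOCAL = 1   # task runs in the same rack as a node holding the data
    RACK_REMOTE = 2  # task runs in a different rack from every copy


def classify_locality(task_host, split_hosts, rack_of):
    """Classify where a task runs relative to the hosts holding its split.

    task_host   -- host the task was scheduled on (hypothetical name)
    split_hosts -- hosts that hold a copy of the input split
    rack_of     -- dict mapping every host to its rack (illustrative topology)
    """
    if task_host in split_hosts:
        return Locality.NODE_LOCAL
    split_racks = {rack_of[host] for host in split_hosts}
    if rack_of[task_host] in split_racks:
        return Locality.RACK_LOCAL
    return Locality.RACK_REMOTE


# Illustrative topology: two nodes in one rack, one node in another.
rack_of = {"node1": "/rack1", "node2": "/rack1", "node3": "/rack2"}
split_hosts = ["node1"]

for host in ("node1", "node2", "node3"):
    print(host, classify_locality(host, split_hosts, rack_of).name)
```

With this made-up topology, only the task on `node1` counts as a local map task; the other two are non-local, one rack-local and one rack-remote.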

Mumak can slow down tasks scheduled on non-local nodes through the following settings in your configuration file:

<property>
    <name>mumak.scale.racklocal</name>
    <value>1.5</value>
    <description>Scaling factor for task attempt runtime of rack-local over
    node-local</description>
</property>

<property>
    <name>mumak.scale.rackremote</name>
    <value>1.8</value>
    <description>Scaling factor for task attempt runtime of rack-remote over
    node-local</description>
</property>
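As a rough illustration of what these factors mean, the sketch below multiplies a node-local runtime by the factor for the locality the simulated task ends up with. This only follows the property descriptions above ("runtime of rack-local over node-local"); it assumes the runtime taken from the trace was measured node-local, and it is not a copy of Mumak's actual logic. The function name `simulated_runtime_ms` and the `SCALE` table are hypothetical.

```python
# Factors taken from the configuration above; node-local is the baseline.
SCALE = {
    "node-local": 1.0,
    "rack-local": 1.5,   # mumak.scale.racklocal
    "rack-remote": 1.8,  # mumak.scale.rackremote
}


def simulated_runtime_ms(node_local_runtime_ms, locality):
    """Scale a node-local runtime for the given simulated locality.

    Assumption: node_local_runtime_ms was recorded for a node-local attempt.
    """
    return node_local_runtime_ms * SCALE[locality]


# A task that took 10 seconds node-local, replayed at each locality level.
for locality in ("node-local", "rack-local", "rack-remote"):
    print(locality, simulated_runtime_ms(10_000, locality))
```

Under these assumptions a 10-second node-local map task would be simulated as taking about 15 seconds when rack-local and about 18 seconds when rack-remote.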

Upvotes: 1
