python_sre

Reputation: 21

In Hadoop, what is the difference and relationship between JobTracker and TaskTracker?

As the title says: in Hadoop, what is the difference and relationship between the JobTracker and the TaskTracker? Can someone explain this to me? Thanks for your help!

Upvotes: 2

Views: 4803

Answers (1)

andani

Reputation: 424

Job Tracker –

  • The JobTracker process runs on a separate node, usually not on a DataNode.
  • JobTracker is an essential Daemon for MapReduce execution in MRv1. It is replaced by ResourceManager/ApplicationMaster in MRv2.
  • JobTracker receives the requests for MapReduce execution from the client.
  • JobTracker talks to the NameNode to determine the location of the data.
  • JobTracker finds the best TaskTracker nodes to execute tasks based on the data locality (proximity of the data) and the available slots to execute a task on a given node.
  • JobTracker monitors the individual TaskTrackers and reports the overall status of the job back to the client.
  • JobTracker process is critical to the Hadoop cluster in terms of MapReduce execution.
  • When the JobTracker is down, HDFS will still be functional, but new MapReduce jobs cannot be started and existing MapReduce jobs will be halted.
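
To make the placement step above concrete, here is a minimal sketch of choosing a TaskTracker by data locality and free slots. It is a toy model, not Hadoop's scheduler code: `Tracker`, `pick_tracker`, and the host/rack names are all hypothetical, and the preference order (same node, then same rack, then anywhere) is my reading of "proximity of the data".

```python
from dataclasses import dataclass


@dataclass
class Tracker:
    """Toy stand-in for a TaskTracker (hypothetical, not a Hadoop class)."""
    host: str
    rack: str
    free_slots: int


def pick_tracker(trackers, block_hosts, block_racks):
    """Pick a tracker for a task whose input block lives on block_hosts.

    Prefers a node that holds the block, then a node on the same rack,
    then any node. Trackers without a free slot are never chosen.
    Returns None if no tracker has a free slot.
    """
    candidates = [t for t in trackers if t.free_slots > 0]
    if not candidates:
        return None

    def locality(t):
        if t.host in block_hosts:
            return 0  # node-local
        if t.rack in block_racks:
            return 1  # rack-local
        return 2      # off-rack

    # Lower locality score wins; ties go to the tracker with more free slots.
    return min(candidates, key=lambda t: (locality(t), -t.free_slots))


trackers = [
    Tracker("n1", "r1", 0),  # holds the block but has no free slot
    Tracker("n2", "r1", 2),
    Tracker("n3", "r2", 4),
]
chosen = pick_tracker(trackers, block_hosts={"n1"}, block_racks={"r1"})
```

Here `chosen` ends up being `n2`: the node that holds the data is full, so the task goes to a node on the same rack instead of the emptier node on another rack.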

Task Tracker –

  • TaskTracker runs on DataNodes, usually on every DataNode in the cluster.

  • TaskTracker is replaced by NodeManager in MRv2.

  • TaskTracker will be in constant communication with the JobTracker signalling the progress of the task in execution.

  • Mapper and Reducer tasks are executed on DataNodes administered by TaskTrackers.

  • TaskTrackers will be assigned Mapper and Reducer tasks to execute by JobTracker.

  • TaskTracker failure is not considered fatal. When a TaskTracker becomes unresponsive, JobTracker will reassign the tasks that were running on it to another node.
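
The failure handling in the last point can be sketched roughly like this. Again, this is only an illustration under my own assumptions: `reassign_lost_tasks` is a hypothetical helper, the timeout value is a placeholder, and the round-robin choice of a replacement node is a simplification (a real scheduler would also consider locality and free slots).

```python
def reassign_lost_tasks(last_heartbeat, assignments, now, timeout=600.0):
    """Move tasks away from trackers that have stopped sending heartbeats.

    last_heartbeat: dict of host -> time of the last heartbeat (seconds)
    assignments:    dict of task id -> host currently running the task
    Returns a new task -> host dict. A task maps to None when no tracker
    is alive to take it over.
    """
    # A tracker counts as alive if it reported within the timeout window.
    alive = sorted(h for h, ts in last_heartbeat.items() if now - ts <= timeout)
    updated = {}
    moved = 0
    for task, host in sorted(assignments.items()):
        if host in alive:
            updated[task] = host  # tracker is healthy, leave the task alone
        elif alive:
            # Spread orphaned tasks over the live trackers in round-robin order.
            updated[task] = alive[moved % len(alive)]
            moved += 1
        else:
            updated[task] = None  # nothing left to run it on
    return updated


heartbeats = {"n1": 100.0, "n2": 990.0, "n3": 995.0}
running = {"map_0": "n1", "map_1": "n2", "reduce_0": "n1"}
result = reassign_lost_tasks(heartbeats, running, now=1000.0)
```

In this example `n1` has been silent for 900 seconds, so its two tasks are handed to `n2` and `n3`, while `map_1` stays where it was.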

Upvotes: 3
