Reputation: 1533
If I have 3 datanodes and I set the number of reduce tasks to 4, what happens? Will the fourth task wait until one of the datanodes finishes its reduce task, or will two of them run on the same datanode at the same time?
Upvotes: 0
Views: 505
Reputation: 5063
Adding to Chaos's answer: if you set the number of reduce tasks higher than the total number of reduce slots available across the cluster, the remaining reduce tasks will wait and run as soon as reduce slots free up.
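To put numbers on that, here is a rough sketch of the arithmetic. The function name `reduce_waves` is made up for illustration, and the "waves" figure assumes all reduce tasks take about the same time, which real jobs rarely do.

```python
import math

def reduce_waves(num_reduce_tasks, total_reduce_slots):
    """Return (running_now, waiting, waves) for a given cluster capacity.

    Hypothetical helper: 'waves' is only meaningful under the assumption
    that every reduce task takes roughly the same amount of time.
    """
    if total_reduce_slots <= 0:
        raise ValueError("the cluster needs at least one reduce slot")
    # Tasks that can start immediately are capped by the slot count.
    running_now = min(num_reduce_tasks, total_reduce_slots)
    # Everything else waits for a slot to free up.
    waiting = num_reduce_tasks - running_now
    # Number of rounds needed if tasks finish in lockstep.
    waves = math.ceil(num_reduce_tasks / total_reduce_slots)
    return running_now, waiting, waves
```

With 4 reduce tasks and 3 reduce slots in total, `reduce_waves(4, 3)` gives 3 running, 1 waiting, 2 waves. With 6 slots in total, all 4 start at once.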
Upvotes: 3
Reputation: 11721
Reduce tasks do not depend on the number of datanodes; they depend on the number of reduce slots assigned to each node. The TaskTracker on each node is responsible for running tasks in that node's slots. A node can have more than one slot, so more than one reduce task can run on the same node at the same time.
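A toy simulation may make the slot idea more concrete. This is not how Hadoop's scheduler actually picks nodes (it assigns work on TaskTracker heartbeats); it is only a greedy sketch in which each task takes whichever slot frees up first. The function name, node names, and durations are all invented for the example. As far as I know, the per-node reduce slot count in classic MapReduce is set with `mapred.tasktracker.reduce.tasks.maximum`.

```python
import heapq

def schedule_reduces(task_durations, slots_per_node):
    """Greedy toy scheduler, NOT Hadoop's real algorithm.

    task_durations: list of run times, one per reduce task (made-up units).
    slots_per_node: dict mapping a node name to its number of reduce slots.
    Returns a list of (task_id, node, start_time, end_time).
    """
    # One heap entry per slot: (time the slot becomes free, node, slot index).
    slots = [(0, node, i)
             for node, count in sorted(slots_per_node.items())
             for i in range(count)]
    if not slots:
        raise ValueError("no reduce slots configured")
    heapq.heapify(slots)

    timeline = []
    for task_id, duration in enumerate(task_durations):
        # Take the slot that is free earliest; the task waits until then.
        free_at, node, slot = heapq.heappop(slots)
        end = free_at + duration
        timeline.append((task_id, node, free_at, end))
        heapq.heappush(slots, (end, node, slot))
    return timeline
```

With one slot on each of three nodes and four equal tasks, the fourth task starts only after an earlier one finishes. With two slots per node, all four start at time 0 and some node runs two reduce tasks at once, which is the case the question asks about.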
Upvotes: 2