Reputation: 55
I need to calculate the progress of each map task running on all nodes in a Hadoop cluster. I was thinking of dividing the size of the processed data by the size of the whole input data, but I am not sure how to get this information for a task.
I see that the TaskStatus class has a getProgress() method, but there is no description for it. Does it provide the value that I need?
Upvotes: 1
Views: 1228
Reputation: 19867
For a map task, yes: getProgress() returns how far the mapper has progressed through its input, as a fraction between 0 and 1. For reduce tasks, the calculation is less straightforward. This article has a pretty good explanation.
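To make the "processed bytes divided by total bytes" idea from the question concrete, here is a minimal sketch of how such a fraction can be computed for a single input split. The class SplitProgress and its fraction method are hypothetical names used only for illustration and are not part of Hadoop. The formula mirrors the usual approach of a file-based record reader, but the real calculation depends on the input format in use, so treat this as an assumption rather than the exact implementation.

```java
// Hypothetical helper, NOT a Hadoop class: illustrates the
// bytes-processed / bytes-total idea for one input split.
final class SplitProgress {
    private SplitProgress() {}

    /**
     * @param start byte offset where the split begins
     * @param end   byte offset where the split ends (exclusive)
     * @param pos   current read position within the file
     * @return progress through the split, clamped to [0, 1]
     */
    static float fraction(long start, long end, long pos) {
        if (end <= start) {
            return 0.0f; // empty split: avoid dividing by zero
        }
        float f = (pos - start) / (float) (end - start);
        // The reader may sit before the split or read slightly past its end.
        return Math.max(0.0f, Math.min(1.0f, f));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // A split covering bytes 128 MB to 256 MB, with the reader at 160 MB.
        System.out.println(fraction(128 * mb, 256 * mb, 160 * mb)); // prints 0.25
    }
}
```

Note that the fraction is relative to the task's own split, not to the whole job input, so each map task reports its progress independently.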
Upvotes: 2