MihaelaO

Reputation: 55

Hadoop Task progress

I need to calculate the progress of each map task running on all nodes in a Hadoop cluster. I was thinking of dividing the size of the processed data by the size of the whole input data, but I am not sure how to get this information for a task.

I see that the TaskStatus class has a getProgress() method, but it has no description. Does it return the value that I need?

Upvotes: 1

Views: 1228

Answers (1)

highlycaffeinated

Reputation: 19867

For a map task, yes: getProgress() returns how far the mapper has progressed through its input, as a float between 0.0 and 1.0. For reduce tasks, the calculation is less straightforward. This article has a pretty good explanation.
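To illustrate what that progress value represents for a map task, here is a minimal plain-Java sketch, assuming progress is simply the fraction of the task's input that has been read so far. The class `SplitProgress` and its `fraction` method are hypothetical names for illustration, not part of the Hadoop API, and returning 0 for an empty input is an assumption rather than something the answer confirms.

```java
// Hypothetical helper, NOT part of Hadoop: estimates map-task progress
// as the fraction of the task's input that has been consumed.
final class SplitProgress {

    // start: byte offset where the task's input begins
    // pos:   current read position
    // end:   byte offset where the task's input ends
    static float fraction(long start, long pos, long end) {
        if (end <= start) {
            // Empty input: reporting 0 here is an assumption of this sketch.
            return 0.0f;
        }
        float f = (pos - start) / (float) (end - start);
        // Clamp so a reader that runs slightly past the boundary never
        // reports a value outside [0, 1].
        return Math.max(0.0f, Math.min(1.0f, f));
    }

    public static void main(String[] args) {
        long total = 64L * 1024 * 1024;   // 64 MB of input
        long read = 32L * 1024 * 1024;    // 32 MB read so far
        System.out.println(fraction(0L, read, total)); // prints 0.5
    }
}
```

This is the same "processed size divided by total size" idea from the question, so in practice you should not need to compute it yourself if the value reported by getProgress() is enough.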

Upvotes: 2
