Reputation: 165
Here is an image of the job's running status.
As you can see, the map tasks have already finished and the reduce tasks have reached 90.74%. According to the MapReduce model, this shouldn't happen. How could it happen, and why? Is there any way to fix it?
My environment:
Upvotes: 0
Views: 188
Reputation: 30089
This can occur if one or more reducers fail to pull the temporary map outputs from a given map task or task tracker (say the task tracker that ran map task 00001 fails for some reason).
In this case Hadoop will re-run the map task(s) on another node. In your case it's more complicated than that - it looks like you have a blacklisted Task Tracker and a number of failed reducer tasks.
My guess in this situation is that during the process of running the reducer phase, all the reducer tasks running on a single task tracker failed - causing the task tracker to be blacklisted. In this case any map tasks run on that node will need to be rescheduled to run again on another task tracker - hence the 5 pending Map tasks.
As to how to fix this - this is an error-handling case built into Hadoop. You should check the logs for the failed map task and failed reducer tasks for clues - it could be any number of issues (disk space, max HTTP thread count for the task tracker, memory requirements for the reducer implementation, a bug in your ser-de methods for custom writables, ...).
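If you have shell access to the task tracker nodes, one way to triage this is to scan the per-attempt log directories for error lines. Here's a minimal sketch - the function name, the error pattern, and the directory layout (one directory per task attempt containing a `syslog` file, as in a typical Hadoop 1.x `$HADOOP_LOG_DIR/userlogs` tree) are my assumptions, so adjust them for your install:

```python
import os
import re

def find_failed_attempts(userlogs_dir, pattern=r"Exception|Error|FATAL"):
    """Scan each task-attempt directory under userlogs_dir and return
    {attempt_id: [matching log lines]} for attempts whose syslog
    contains lines matching the error pattern.

    NOTE: hypothetical helper - the layout (attempt_*/syslog) matches a
    typical MRv1 TaskTracker install but may differ on your cluster.
    """
    rx = re.compile(pattern)
    hits = {}
    for attempt in sorted(os.listdir(userlogs_dir)):
        syslog = os.path.join(userlogs_dir, attempt, "syslog")
        if not os.path.isfile(syslog):
            continue
        with open(syslog, errors="replace") as f:
            matches = [line.rstrip() for line in f if rx.search(line)]
        if matches:
            hits[attempt] = matches
    return hits
```

Running it against the blacklisted node's userlogs directory should quickly surface which attempts died and with what stack trace (e.g. an `OutOfMemoryError` would point at reducer memory settings, a "too many fetch-failures" message at the shuffle/HTTP thread side).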
Feel free to investigate and post stack traces back into your original question (or a new question).
Upvotes: 1