Avinash L

Reputation: 187

How to keep a Map/Reduce task from failing with a timeout in Hadoop

My Reducer contains a loop with a very large number of iterations, and each iteration calls a computation-heavy function.

while (context.getCounter(SOLUTION_FLAG.SOLUTION_FOUND).getValue() < 1 && itrCnt < MAX_ITR)

MAX_ITR is the maximum number of iterations, supplied by the user.

The problem is that when I run the job on a Hadoop cluster, the Reducer task hits a timeout and is killed:

17/05/06 21:09:43 INFO mapreduce.Job: Task Id : attempt_1494129392154_0001_r_000000_0, Status : FAILED
AttemptID:attempt_1494129392154_0001_r_000000_0 Timed out after 600 secs

What should I do to avoid the timeout? (My guess is that it has to do with heartbeat signals.)

Upvotes: 0

Views: 2297

Answers (1)

Sandeep Singh

Reputation: 7990

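The timeout is most likely caused by a long-running computation in the reducer that never reports progress back to the Hadoop framework. A task that neither reads input, writes output, nor updates its status within the timeout interval is killed. You can try raising the interval from its default of 600 seconds with the property below; the value is in milliseconds.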

mapred.task.timeout=1800000
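As a rough sketch of the arithmetic: the log reports 600 secs, while the property is given in milliseconds, so 1800000 corresponds to 30 minutes. The class and method names below (TaskTimeout, toMillis, asGenericOption) are made up for illustration. The key mapreduce.task.timeout is the newer name for mapred.task.timeout, and the -D form is assumed to be passed to a driver that runs through ToolRunner.

```java
import java.util.concurrent.TimeUnit;

public class TaskTimeout {
    // Newer property name; mapred.task.timeout is the older, deprecated alias.
    static final String KEY = "mapreduce.task.timeout";

    /** Converts a timeout in minutes to the millisecond value the property expects. */
    static long toMillis(long minutes) {
        return TimeUnit.MINUTES.toMillis(minutes);
    }

    /** Builds a -D generic option (assumes the driver parses generic options, e.g. via ToolRunner). */
    static String asGenericOption(long minutes) {
        return "-D " + KEY + "=" + toMillis(minutes);
    }

    public static void main(String[] args) {
        System.out.println(asGenericOption(30)); // prints "-D mapreduce.task.timeout=1800000"
        // In a job driver the same value could be set with:
        //   conf.setLong(TaskTimeout.KEY, TaskTimeout.toMillis(30));
    }
}
```

Setting the value per job this way avoids changing the cluster-wide default for every other job.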

Here is more reference on this.

If this setting does not help, consider rechecking the code. There could be an issue with the loop logic too.
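If the computation is legitimately long, another option is to have the loop itself tell the framework that it is still alive. The sketch below shows the pattern without any Hadoop dependency: a Runnable stands in for the heartbeat, and in a real reducer it would be context::progress (Hadoop's context object exposes a progress() method). The class name HeartbeatLoop, the reportEvery parameter and the step predicate are hypothetical names for illustration, not part of the original code.

```java
import java.util.function.LongPredicate;

public class HeartbeatLoop {
    /**
     * Runs up to maxItr iterations of a heavy step and calls heartbeat
     * once every reportEvery iterations. Returns the iterations executed.
     */
    public static long run(long maxItr, long reportEvery,
                           Runnable heartbeat, LongPredicate step) {
        long itrCnt = 0;
        boolean found = false;
        while (!found && itrCnt < maxItr) {
            // The heavy computation; returns true once a solution is found.
            found = step.test(itrCnt);
            itrCnt++;
            if (itrCnt % reportEvery == 0) {
                heartbeat.run(); // in a reducer: context.progress()
            }
        }
        return itrCnt;
    }

    public static void main(String[] args) {
        long[] beats = {0};
        // Toy step that "finds a solution" on iteration index 449.
        long done = run(1000, 100, () -> beats[0]++, i -> i == 449);
        System.out.println(done + " iterations, " + beats[0] + " heartbeats");
    }
}
```

Inside reduce(), the call might look like HeartbeatLoop.run(MAX_ITR, 1000, context::progress, i -> heavyStep(i)), where heavyStep is a placeholder for the existing computation. The reporting interval should be small enough that a heartbeat is sent well within the task timeout.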

Upvotes: 1
