Reputation: 11619
I can't remember which API method to call. My problem is this:
My mapper runs for more than 10 minutes, and I don't want to increase the default timeout.
Instead, I want the mapper to send a progress ping to the task tracker while it is in the particular code path that takes longer than 10 minutes.
Which API method should I call?
Upvotes: 2
Views: 1278
Reputation: 20969
You can simply increment a counter and call progress(). This ensures that the task sends a heartbeat back to the tasktracker, so it knows the task is still alive.
In the new API this is managed through the context, see here: http://hadoop.apache.org/common/docs/r1.0.0/api/index.html
For example:
@Override
protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    // increment counter
    context.getCounter(SOME_ENUM).increment(1);
    // tell the framework the task is still alive
    context.progress();
}
In the old API there is the Reporter interface: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/Reporter.html
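If the slow part is a long loop inside a single map() call, one progress() call at the top of map() may not be enough, so the ping has to happen inside the loop. Here is a minimal sketch of that idea. It does not use Hadoop classes so that it runs on its own: ProgressSink is a hypothetical stand-in for whatever exposes progress() (the Context in the new API, the Reporter in the old one), and slowWork, steps and pingEvery are made-up names for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ProgressPingSketch {

    /** Hypothetical stand-in for the progress() method of Context or Reporter. */
    interface ProgressSink {
        void progress();
    }

    /**
     * Runs `steps` units of slow work and pings the sink once every
     * `pingEvery` steps, so the framework keeps hearing from the task.
     */
    static long slowWork(int steps, int pingEvery, ProgressSink sink) {
        long acc = 0;
        for (int i = 1; i <= steps; i++) {
            acc += i; // placeholder for the expensive per-step computation
            if (i % pingEvery == 0) {
                sink.progress(); // "I'm still alive"
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        // Count the pings instead of talking to a real tasktracker.
        AtomicInteger pings = new AtomicInteger();
        long result = slowWork(1000, 100, pings::incrementAndGet);
        System.out.println("result=" + result + ", pings=" + pings.get());
    }
}
```

In a real mapper you would presumably pass context::progress (or reporter::progress in the old API) as the sink, and pick pingEvery so that the gap between two pings stays well below the task timeout.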
Upvotes: 6
Reputation: 23373
You typically use the Reporter to let the framework know you're still alive.
Quote from the javadoc:
Mapper and Reducer can use the Reporter provided to report progress or just indicate that they are alive. In scenarios where the application takes a significant amount of time to process individual key/value pairs, this is crucial since the framework might assume that the task has timed-out and kill that task.
Upvotes: 1