avhacker

Reputation: 677

Clean up failed maps

My mapper writes some data to local disk and cleans it up when the mapper finishes. However, the cleanup() method isn't called if an error occurs (an exception is thrown). I can catch exceptions raised inside my mapper, but I can't handle failures that originate outside it (for example, the JobTracker failing over to a standby node).

Is there any way to clean up when the mapper fails?

Upvotes: 0

Views: 326

Answers (2)

DDW

Reputation: 2015

Using the Job class, you can delete folders once the job finishes, even if the directories are on the local filesystem; use the FileSystem class to do so.

More on filesystems in Hadoop

Upvotes: 0

Chris White

Reputation: 30089

You can override the mapper's run method to wrap the iteration over the input keys from the context in a try / finally block, which ensures that cleanup is called:

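// Requires: import java.io.IOException;
@Override
public void run(Context context) throws IOException, InterruptedException {
  setup(context);

  try {
    // Feed every input record to map(), as the default run() does.
    while (context.nextKeyValue()) {
      map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
  } finally {
    // Runs whether the loop completes or an exception propagates out of map().
    cleanup(context);
  }
}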

You'll need to make sure that your cleanup method doesn't contain any logic that tries to output records, or else set a flag in your mapper to denote that an error occurred so that cleanup can skip that logic.
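The flag idea can be sketched roughly as follows. To keep it runnable without Hadoop on the classpath, this uses a plain stand-in class rather than a real Mapper; `FlaggedCleanupSketch`, `emitted`, and `localFilesRemoved` are hypothetical names for illustration, and `emitted.add(...)` stands in for `context.write(...)`. In a real mapper the same try / catch / finally shape would go inside the overridden run method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a Mapper subclass; not part of the Hadoop API.
class FlaggedCleanupSketch {
    private boolean failed = false;                  // set when map() throws
    final List<String> emitted = new ArrayList<>();  // stands in for context.write(...)
    boolean localFilesRemoved = false;               // stands in for deleting local files

    void map(String record) {
        if (record.isEmpty()) {
            throw new IllegalStateException("bad record");
        }
        emitted.add(record.toUpperCase());
    }

    void cleanup() {
        if (!failed) {
            // Only emit trailing output when every record was mapped successfully.
            emitted.add("SUMMARY");
        }
        // Local resources are released on both the success and the failure path.
        localFilesRemoved = true;
    }

    void run(Iterator<String> input) {
        try {
            while (input.hasNext()) {
                map(input.next());
            }
        } catch (RuntimeException | Error e) {
            failed = true;   // record the failure before cleanup() runs
            throw e;         // let the framework still see the task as failed
        } finally {
            cleanup();
        }
    }

    public static void main(String[] args) {
        FlaggedCleanupSketch ok = new FlaggedCleanupSketch();
        ok.run(Arrays.asList("a", "b").iterator());
        System.out.println("success path emitted: " + ok.emitted);

        FlaggedCleanupSketch bad = new FlaggedCleanupSketch();
        try {
            bad.run(Arrays.asList("a", "").iterator());
        } catch (IllegalStateException e) {
            System.out.println("failure path emitted: " + bad.emitted
                    + ", local files removed: " + bad.localFilesRemoved);
        }
    }
}
```

Rethrowing after setting the flag matters: swallowing the exception would make a failed task look successful to the framework.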

This may not protect against all types of task failure (a JVM crash, for example). For those I don't think you have any other option, except perhaps to run a second job after the original one whose role is to ensure the resources used are properly cleaned up.
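As a rough illustration of the deletion step such a follow-up task might perform, here is a minimal sketch that uses only java.nio.file. `LeftoverCleaner` and `deleteRecursively` are hypothetical names, the directory in `main` is a temporary one created for the demo, and how the work gets scheduled on each node that ran a failed mapper is outside this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not part of Hadoop.
class LeftoverCleaner {
    // Deletes dir and everything beneath it; returns how many entries were removed.
    // Returns 0 if dir does not exist, so it is safe to call repeatedly.
    static int deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return 0;
        }
        List<Path> ordered;
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            ordered = paths.sorted(Comparator.reverseOrder())
                           .collect(Collectors.toList());
        }
        for (Path p : ordered) {
            Files.delete(p);
        }
        return ordered.size();
    }

    public static void main(String[] args) throws IOException {
        // Simulate a directory left behind by a crashed task.
        Path leftover = Files.createTempDirectory("mapper-scratch");
        Files.write(leftover.resolve("part.tmp"), new byte[] {1, 2, 3});

        int removed = deleteRecursively(leftover);
        System.out.println("removed " + removed + " entries; still exists: "
                + Files.exists(leftover));
    }
}
```

Because a missing directory is treated as already clean, the step can be rerun after a successful job without special-casing.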

Upvotes: 2
