cosmos

Reputation: 2424

Resume Hadoop Jobs workflow

In my application, I have a series of 5 Hadoop Jobs which are chained together sequentially using

Job.waitForCompletion(false)
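
For context, such a chain would look roughly like the sketch below (a hypothetical example; class names, job names and configuration are placeholders, not the actual application):

// Minimal sketch of the sequential chaining described above (old-style
// mapreduce API); job names, mappers, and paths are placeholders.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ChainedJobsDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        Job job1 = new Job(conf, "step-1");
        // ... configure mapper/reducer/input/output for job1 ...
        if (!job1.waitForCompletion(false)) {
            System.exit(1); // abort the chain if a step fails
        }

        Job job2 = new Job(conf, "step-2");
        // ... configure job2, typically reading job1's output ...
        if (!job2.waitForCompletion(false)) {
            System.exit(1);
        }
        // ... steps 3 to 5 follow the same pattern ...
    }
}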

Now, the Hadoop docs clearly state:

...the onus on ensuring jobs are complete 
(success/failure) lies squarely on the clients

Now, if my job client program crashes, how do I ensure that it can resume from the point of the crash when it is restarted? Is there any way to query the JobTracker, get a handle to a specific job, and then check its status?

Upvotes: 0

Views: 1057

Answers (1)

Ash

Reputation: 16

The following approach can be tried when the client itself crashes:

Hadoop provides JobClient, which can be used to track the jobs currently running in the cluster. So when the client restarts, the following methods of JobClient can be used (a minimal sketch follows the list):

  • jobsToComplete() - Get the jobs that are not completed and not failed.
  • getAllJobs() - Get the jobs that are submitted.
  • getClusterStatus() - Get status information about the Map-Reduce cluster.
  • submitJob(JobConf job) - Submit a job to the MR system, e.g. to resubmit one that has failed.
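
As a minimal sketch (assumed class and variable names, old-API JobClient; not the asker's code), restarting the client and inspecting outstanding jobs could look like this:

// Minimal sketch: on restart, query the cluster via the old-API JobClient
// to find jobs that are still pending/running and inspect their status.
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobStatus;
import org.apache.hadoop.mapred.RunningJob;

public class ResumeCheck {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        JobClient client = new JobClient(conf);

        // Jobs that are neither completed nor failed (still pending or running)
        JobStatus[] pending = client.jobsToComplete();
        for (JobStatus status : pending) {
            System.out.println(status.getJobID() + " runState=" + status.getRunState());
        }

        // Get a handle to a specific job by its ID and poll it
        if (pending.length > 0) {
            RunningJob job = client.getJob(pending[0].getJobID());
            System.out.println("complete? " + job.isComplete()
                    + ", successful? " + job.isSuccessful());
        }
    }
}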

Upvotes: 0
