Dolan Antenucci

Reputation: 15942

How do I get the success/failure status of a Hadoop job from the command line?

I'm using CDH4 with MRv1. From what I can tell, there is no command-line tool for checking the status of a completed job. When I go to the web console's job detail page, I can see "Status: Failed" or "Status: Succeeded". If I run mapred job -list all or mapred job -status job_201309231203_0011, neither indicates "Failed" or "Succeeded".

Am I missing some other command?

Upvotes: 0

Views: 16858

Answers (3)

Vijay_Pansheriya

Reputation: 31

hadoop job -list all
hadoop job -status <JobID>

Alternatively, the Hadoop JobTracker web dashboard can help you find this error or other job-related information.
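
If you want a scriptable check rather than the web UI, you can filter the listing for a single job; a rough sketch (the job ID is the example from the question, and column 2 of the hadoop job -list all output is a numeric state code, 1 = Running, 2 = Succeeded, 3 = Failed, 4 = Prep, per its header):

# Sketch: print the numeric state code (column 2 of the listing) for one job.
# job_201309231203_0011 is the example job ID from the question.
hadoop job -list all | grep '^job_201309231203_0011' | awk '{print $2}'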

Upvotes: 0

Saurabh

Reputation: 7833

My Hadoop version is 2.5.0. This works for me:
first, get the job ID using

hadoop job -list

then check the status using that job ID:

hadoop job -status {job_id}
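
For scripting, you can extract the state from that report; a minimal sketch, assuming the Hadoop 2.x output format, where hadoop job -status prints a line such as "Job state: SUCCEEDED" or "Job state: FAILED":

# Sketch: pull the final state out of the `hadoop job -status` report.
# Assumes a Hadoop 2.x-style "Job state: ..." line in the output.
job_id=job_201309231203_0011   # example job ID from the question
hadoop job -status ${job_id} | awk -F': ' '/Job state:/ {print $2}'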

Upvotes: 3

onlynone

Reputation: 8289

The first couple of lines of output from hadoop job -list all are:

X jobs submitted
States are:
        Running : 1     Succeded : 2    Failed : 3      Prep : 4
JobId   State   StartTime       UserName        Priority        SchedulingInfo

And the lines of output look like:

job_201309171413_38136  1       1382455374980   somebody        NORMAL  0 running map tasks using 0 map slots. 0 additional slots reserved. 1 running reduce tasks using 1 reduce slots. 0 additional slots reserved.
job_201309171413_37222  2       1382430339635   somebody        NORMAL  0 running map tasks using 0 map slots. 0 additional slots reserved. 0 running reduce tasks using 0 reduce slots. 0 additional slots reserved.

That second column is the State of the job. Based on the header lines, 1 means Running and 2 means Succeeded. It's not the clearest format: four lines of headers, a state code that has to be cross-referenced against the header to be meaningful, and no way to ask for the state of just one job.

The easiest way to parse this output for a specific job is:

$ job_id=job_201309171413_38136
$ hadoop job -list all | awk -v job_id=${job_id} 'BEGIN{OFS="\t"; FS="\t"; final_state="Unknown"} $0 == "States are:" {getline; for(i=1;i<=NF;i++) { split($i,s," "); states[s[3]] = s[1] }} $1==job_id { final_state=states[$2]; exit} END{print final_state}'
Running

$ job_id=job_201309171413_37222
$ hadoop job -list all | awk -v job_id=${job_id} 'BEGIN{OFS="\t"; FS="\t"; final_state="Unknown"} $0 == "States are:" {getline; for(i=1;i<=NF;i++) { split($i,s," "); states[s[3]] = s[1] }} $1==job_id { final_state=states[$2]; exit} END{print final_state}'
Succeeded

$ job_id=foobar
$ hadoop job -list all | awk -v job_id=${job_id} 'BEGIN{OFS="\t"; FS="\t"; final_state="Unknown"} $0 == "States are:" {getline; for(i=1;i<=NF;i++) { split($i,s," "); states[s[3]] = s[1] }} $1==job_id { final_state=states[$2]; exit} END{print final_state}'
Unknown
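
If you need this check in more than one place, the same awk one-liner can be wrapped in a small shell function (a sketch; the name job_state is just illustrative):

# Sketch: reusable wrapper around the awk one-liner above.
# Usage: job_state job_201309171413_38136
job_state() {
    hadoop job -list all | awk -v job_id="$1" '
        BEGIN { FS = "\t"; final_state = "Unknown" }
        $0 == "States are:" {
            # The next line holds the "Name : code" pairs; map code -> name.
            getline
            for (i = 1; i <= NF; i++) { split($i, s, " "); states[s[3]] = s[1] }
        }
        $1 == job_id { final_state = states[$2]; exit }
        END { print final_state }'
}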

Upvotes: 5
