DeejUK

Reputation: 13481

How does Logging in Hadoop Jobs Work?

How does logging in a Hadoop job work? Using SLF4J and Logback, what sort of configuration would I need to see all the logging output in one place? Does STDOUT for a Hadoop job get collated by the JobTracker?

Upvotes: 1

Views: 1580

Answers (1)

Joe23

Reputation: 5782

The log directory on each datanode contains a subdirectory named userlogs. This in turn contains subdirectories for recent map-task attempts, that is, one for each instance of a map task. Since a task attempt's name contains the job id, you can find out which logs were created by a specific job.

The task attempt directories contain the files:

  • stderr
  • stdout
  • syslog

These contain the respective outputs.
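As a rough sketch of how you could locate these files from the command line, the snippet below builds a mock userlogs layout locally (the log directory, job id, and attempt name are made-up examples, not values from a real cluster) and then globs for attempt directories belonging to one job:

```shell
# Mock layout for illustration; on a real cluster this would live under
# the datanode's log directory (e.g. somewhere like ${HADOOP_LOG_DIR}).
LOG_DIR=$(mktemp -d)
JOB_ID=job_201301011234_0042                       # hypothetical job id
ATTEMPT=attempt_201301011234_0042_m_000000_0       # hypothetical attempt

# Each task attempt gets its own directory with stderr, stdout, syslog:
mkdir -p "$LOG_DIR/userlogs/$ATTEMPT"
echo "hello from map task" > "$LOG_DIR/userlogs/$ATTEMPT/stdout"
: > "$LOG_DIR/userlogs/$ATTEMPT/stderr"
: > "$LOG_DIR/userlogs/$ATTEMPT/syslog"

# The attempt name embeds the job id (minus the "job_" prefix),
# so a glob finds every attempt that belongs to this job:
for d in "$LOG_DIR"/userlogs/*"${JOB_ID#job_}"*; do
  echo "=== $d ==="
  cat "$d/stdout"
done
```

The same glob idea works for stderr and syslog; swap the filename in the `cat` line.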

You can access task logs from the JobTracker Web-GUI by navigating from a listed Job to its tasks, clicking on a task and selecting its output.

Upvotes: 2
