Prasanth S

Reputation: 41

Hadoop Yarn container logs missing

Normally we can see the YARN container logs under "/var/log/hadoop-yarn/containers". I can see logs for successful jobs, but not for failed jobs. The NodeManager log shows the container logs being deleted.

Log:

2017-07-13 14:16:04,170 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor (DeletionService #1): Deleting path : /var/log/hadoop-yarn/containers/application_1234567890_12345/container_11234567890_12345_11_000001/stdout
2017-07-13 14:16:04,180 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl (LogAggregationService #6093): renaming /var/log/hadoop-yarn/apps/hadoop/logs/application_1234567890_12345/xx.xx.xx.xx_8041.tmp to /var/log/hadoop-yarn/apps/hadoop/logs/application_1234567890_12345/xx.xx.xx.xx_8041
2017-07-13 14:16:04,181 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor (DeletionService #3): Deleting path : /var/log/hadoop-yarn/containers/application_1234567890_12345
2017-07-13 14:16:06,048 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl (Container Monitor): Stopping resource-monitoring for container_11234567890_12345_11_0000

Here's a snippet of my yarn-site.xml.

Can someone please advise which configuration setting needs to be changed to retain logs for failed jobs?

<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>

<property>
    <name>yarn.log.server.url</name>
    <value>http://ip-XX.XX.XX.XX:19888/jobhistory/logs</value>
</property>

<property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/mnt/yarn</value>
    <final>true</final>
</property>

<property>
    <description>Where to store container logs.</description>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/var/log/hadoop-yarn/containers</value>
</property>

<property>
    <description>Where to aggregate logs to.</description>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/var/log/hadoop-yarn/apps</value>
</property>


<property>
    <name>yarn.log-aggregation.enable-local-cleanup</name>
    <value>true</value>
</property>

<property>
    <name>yarn.scheduler.increment-allocation-mb</name>
    <value>32</value>
</property>

<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>

Upvotes: 3

Views: 4700

Answers (1)

BjornO

Reputation: 911

The logs are moved to HDFS once log aggregation completes; the usual location is /app-logs on HDFS.

Check the following settings in the documentation:

yarn.nodemanager.remote-app-log-dir: Normally /app-logs on HDFS, but in your case it is set to /var/log/hadoop-yarn/apps. Does this directory exist on HDFS? It looks like a local directory value was put here by mistake.

Other settings that may be useful:

yarn.log-aggregation-enable: if this is set to true, the NodeManager will immediately concatenate all of the containers' logs into one file, upload it to HDFS under ${yarn.nodemanager.remote-app-log-dir}/${user.name}/logs/, and delete the originals from the local userlogs directory.

yarn.nodemanager.delete.debug-delay-sec: Number of seconds after an application finishes before the NodeManager's DeletionService deletes the application's localized file directory and log directory. To diagnose YARN application problems, set this value large enough (for example, 600, i.e. 10 minutes) to allow examination of these directories. After changing the value, you must restart the NodeManager for it to take effect.
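
As a rough sketch, the entry in yarn-site.xml could look something like the following. The value 600 is just the example figure mentioned above, not a recommendation; pick whatever window you need to inspect the local container logs before they are removed.

```xml
<!-- Sketch only: delays the DeletionService so local container logs
     stay on disk for a while after the application finishes.
     600 seconds (10 minutes) is an illustrative value. -->
<property>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>600</value>
</property>
```

This would sit alongside the other properties in the NodeManager's yarn-site.xml, and it only takes effect after the NodeManager is restarted.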

Upvotes: 2
