Reputation: 8281
I'm using EMR 5.4 and submitting a Spark job to YARN.
When I try to retrieve the logs with yarn logs -applicationId application_1528461193301_0001
I get the following error:
18/06/08 12:38:01 INFO client.RMProxy: Connecting to ResourceManager at ip-10-0-182-144.eu-west-1.compute.internal/10.0.182.144:8032
s3://xxx/apps/root/logs/application_1528461193301_0001 does not exist.
Log aggregation has not completed or is not enabled.
Here is my configuration in /etc/hadoop/conf/yarn-site.xml:
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <description>Where to store container logs.</description>
  <name>yarn.nodemanager.log-dirs</name>
  <value>s3://xxx/containers</value>
</property>
<property>
  <description>Where to aggregate logs to.</description>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>s3://xxx/apps</value>
</property>
Upvotes: 3
Views: 3027
Reputation: 707
According to the documentation, the yarn logs utility cannot be used when logs are aggregated to S3:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-debugging.html
Note
You cannot currently use log aggregation to Amazon S3 with the yarn logs utility.
You can download the log files with the AWS CLI instead:
aws s3 cp s3://xxx/apps/[applicationId] /your/folder --recursive
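The exact S3 prefix depends on how YARN lays out aggregated logs. As a minimal sketch, assuming the default layout <remote-app-log-dir>/<user>/logs/<applicationId> (which matches the path in the error message in the question), the prefix can be built like this. The function name aggregated_log_prefix and the local folder ./yarn-logs are hypothetical, chosen only for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of YARN or the AWS CLI): build the S3 prefix
# under which an application's aggregated logs are expected to live.
# Assumption: default layout <remote-app-log-dir>/<user>/logs/<applicationId>.
aggregated_log_prefix() {
  remote_dir="${1%/}"   # remote-app-log-dir, with one trailing slash removed
  user="$2"             # user that submitted the application
  app_id="$3"           # YARN application id
  printf '%s/%s/logs/%s\n' "$remote_dir" "$user" "$app_id"
}

prefix="$(aggregated_log_prefix s3://xxx/apps root application_1528461193301_0001)"
echo "$prefix"

# Download step, left commented out because it needs AWS credentials:
# aws s3 cp "$prefix" ./yarn-logs --recursive
```

If the copy finds nothing under that prefix, listing s3://xxx/apps/ with aws s3 ls --recursive should show where the logs actually ended up.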
Upvotes: 1