Reputation: 805
I have deployed a new HDInsight 3.6 cluster with Spark 2.1 installed. I have previously used an HDInsight 3.5 cluster with Spark 1.6.
On this new cluster, I'm unable to access the executor logs from the Spark UI. Typically, on my previous (3.5 / 1.6) clusters, I would go to the Executors tab and click on stderr or stdout to view the logs for an individual executor.
What configuration could be causing this issue, or is there a workaround? I'm submitting a PySpark application, if that makes a difference. Thanks!
Upvotes: 0
Views: 909
Reputation: 497
We had the same problem recently, and we downloaded the logs to the local filesystem by running the following command over SSH:
yarn logs -applicationId <Application ID> -out <path_to_local_folder>
More information is available in the YARN documentation for the yarn logs command.
Then we downloaded the log files to a local machine using WinSCP and analysed them there. If the files are large, you may want to split them before analysing.
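As a minimal sketch, the full sequence might look like the following (the application ID, output folder, and SSH user/host are placeholders to replace with your own values):

# find the application ID of your finished Spark job
yarn application -list -appStates FINISHED

# aggregate the YARN container logs (including executor stdout/stderr) into a local folder on the head node
yarn logs -applicationId <Application ID> -out /tmp/sparklogs/

# copy them to your machine with scp (WinSCP does the same from Windows)
scp -r sshuser@<cluster name>-ssh.azurehdinsight.net:/tmp/sparklogs/ .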
Upvotes: 2
Reputation: 381
This may be a URL rewrite issue if you are accessing the Spark UI without an SSH tunnel. Try refreshing the page, or closing and reopening the browser window.
Alternatively, if the error persists, you can connect to the Spark UI using SSH tunnel. See this documentation on how to connect using SSH tunnel: https://learn.microsoft.com/en-us/azure/hdinsight/hdinsight-linux-ambari-ssh-tunnel
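As a quick sketch, assuming your SSH user is sshuser and your cluster is named mycluster (replace both with your own values), the tunnel command from that documentation looks like this:

ssh -C2qTnNf -D 9876 sshuser@mycluster-ssh.azurehdinsight.net

Then configure your browser to use a SOCKS v5 proxy on localhost port 9876 and open the Ambari/Spark UI URLs through the tunnel.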
Upvotes: 0
Reputation: 12768
You can access the log files from the YARN application view: drill down into the application to find the containers associated with it and their logs (stdout/stderr). You can also launch the Spark UI by clicking the link corresponding to the Tracking URL.
Click the Executors tab to see processing and storage information for each executor. You can also retrieve the call stack by clicking on the Thread Dump link.
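If the UI links still don't resolve, a rough alternative (assuming you have SSH access to the cluster) is to fetch an individual container's logs with the YARN CLI; both IDs below are placeholders you can read from the application view:

yarn logs -applicationId <Application ID> -containerId <Container ID>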
Upvotes: 0