Reputation: 1271
Many places need SUBMISSION_ID
, like spark-submit --status
and Spark REST API. But how can I get this SUBMISSION_ID
when I use spark-submit
command to submit spark jobs?
P.S.:
I use python [popen][2]
to start a spark-submit
job. I want SUBMISSION_ID
so my python program can monitor spark job status via REST API: <ip>:6066/v1/submissions/status/<SUBMISSION_ID>
Upvotes: 1
Views: 5509
Reputation: 1271
Thanks to the clue by @Pandey. The answer https://stackoverflow.com/a/37980813/5634636 helps me a lot.
cluster
mode to submit your job,
i.e., use parameter --deploy-mode cluster
.NOTE: I only test my approaches on Apache Spark 2.3.1. I can't guarantee that it will work in other versions as well.
Let's clear my requirement first. There're 3 features I wanted:
NOTE: this answer only works in cluster mode
The Spark tool spark-submit
will help.
SubmissionID
. This answer https://stackoverflow.com/a/37980813/5634636 told you how to get a submission id in cluster mode. The submission id looks like driver-20190315142356-0004
.Spark submission API is recommended. It seems that there is not any documentation on Apache Spark official website, so some people call it hidden API. For details, see: https://www.nitendragautam.com/spark/submit-apache-spark-job-with-rest-api/
http://<master-ip>:6066/v1/submissions/status/<submission-id>
. The submission-id
will be returned in a json when you submit jobs.http://<driver-ip>:<ui-port>/log/<submission-id>
. Here is an example of error status (**** is an incorrect jar path which is miswritten intentionally):
{
"action" : "SubmissionStatusResponse",
"driverState" : "ERROR",
"message" : "Exception from the cluster:\njava.io.FileNotFoundException: File hdfs:**** does not exist.\n\torg.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:795)\n\torg.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:106)\n\torg.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:853)\n\torg.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:849)\n\torg.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)\n\torg.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:860)\n\torg.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:727)\n\torg.apache.spark.util.Utils$.doFetchFile(Utils.scala:695)\n\torg.apache.spark.util.Utils$.fetchFile(Utils.scala:488)\n\torg.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155)\n\torg.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)\n\torg.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92)",
"serverSparkVersion" : "2.3.1",
"submissionId" : "driver-20190315160943-0005",
"success" : true,
"workerHostPort" : "172.18.0.4:36962",
"workerId" : "worker-20190306214522-172.18.0.4-36962"
}
Upvotes: 2