Reputation: 1006
I'd like to retrieve the status of a Spark job running in cluster mode on a Mesos master with the following command:
spark-submit --master mesos://<ip>:7077 --status "driver-...-..."
It exits 0
with no logging, no matter what the driver's status is.
I know that it's doing something right, since if I run the command with an invalid Mesos IP/port, I get
Exception in thread "main" org.apache.spark.deploy.rest.SubmitRestConnectionException: Unable to connect to server
at org.apache.spark.deploy.rest.RestSubmissionClient$$anonfun$requestSubmissionStatus$3.apply(RestSubmissionClient.scala:165)
and if I run with an invalid submission id, I get
2018-10-02 18:47:01 ERROR RestSubmissionClient:70 - Error: Server responded with message of unexpected type SubmissionStatusResponse.
Any idea why spark-submit --status
isn't returning anything?
Upvotes: 2
Views: 867
Reputation: 9
Add the following entries, log4j.logger.org.apache.spark.deploy.rest.RestSubmissionClient=INFO and log4j.logger.org.apache.spark.deploy.rest=INFO,
to the log4j.properties file under /etc/spark/conf, then check the status again:
spark-submit --master spark://:6066 --status driver-20210516043704-0012
Upvotes: 0
Reputation: 1570
I'm not sure which version of Spark you are using. My investigation is based on spark-2.4.0. The behaviour described below applies to both Spark standalone and Mesos deployment targets.
org.apache.spark.deploy.rest.RestSubmissionClient
is the client that handles REST submission requests, and it logs the server's response at INFO level.
org.apache.spark.deploy.SparkSubmit
is the main class that spark-submit invokes, and its logger acts as the top-level (root) logger for all other loggers.
Programmatically, if no specific logger for SparkSubmit is set in conf/log4j.properties (or if that file is absent), the default level is set to WARN.
Going further, when there is no specific logger entry for RestSubmissionClient, it inherits the level of its root logger, which is SparkSubmit's logger. The INFO-level status response is therefore dropped.
You can still see errors because ERROR messages pass the default WARN threshold.
To see the logs for REST submissions, you may want to add one of the following to ${SPARK_HOME}/conf/log4j.properties: either
log4j.logger.org.apache.spark.deploy.rest.RestSubmissionClient=INFO
or log4j.logger.org.apache.spark.deploy.rest=INFO
to cover the other classes in that package as well.
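The level inheritance described in this answer can be illustrated with Python's standard logging module, which resolves levels through dotted logger names in a similar way to log4j. This is only an analogy under that assumption, not Spark or log4j code; the logger names are reused from the answer purely as labels, and WARNING stands in for log4j's WARN.

```python
import logging

# Analogy only: this is Python's logging module, not log4j.
PARENT = "org.apache.spark.deploy"
CHILD = "org.apache.spark.deploy.rest.RestSubmissionClient"


def info_enabled(name):
    """Return True if the logger with this name would emit INFO messages."""
    return logging.getLogger(name).isEnabledFor(logging.INFO)


# The ancestor logger is set to WARNING, standing in for the WARN default.
logging.getLogger(PARENT).setLevel(logging.WARNING)

# The child has no level of its own, so it inherits WARNING and drops INFO.
inherited = info_enabled(CHILD)

# Giving the child an explicit INFO level mirrors adding the log4j.logger line.
logging.getLogger(CHILD).setLevel(logging.INFO)
overridden = info_enabled(CHILD)

print(inherited, overridden)
```

In this sketch the first check is False and the second is True, which matches the pattern in the question: errors show up, but the INFO-level status response stays hidden until the more specific logger is given its own level.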
Upvotes: 0
Reputation: 1006
I found a workaround by accessing the dispatcher's REST API directly:
curl -s "http://$DISPATCHER/v1/submissions/status/$SUBMISSION_ID"
Still no clear answer why spark-submit --status
does not behave as documented though.
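For anyone who wants to script this workaround, here is a minimal sketch of the same request using only the Python standard library. The helper names (status_url, summarize, fetch_status) are hypothetical, and the JSON field names read below (success, driverState, message) are an assumption about what the server returns; check them against the actual output of the curl command.

```python
import json
import urllib.request


def status_url(dispatcher, submission_id):
    """Build the same URL that the curl command above requests."""
    return f"http://{dispatcher}/v1/submissions/status/{submission_id}"


def summarize(body):
    """Pick a few fields out of the JSON response body.

    The field names are assumptions; missing fields come back as None.
    """
    data = json.loads(body)
    return {
        "success": data.get("success"),
        "driverState": data.get("driverState"),
        "message": data.get("message"),
    }


def fetch_status(dispatcher, submission_id, timeout=10):
    """Query the dispatcher and return the summarized status."""
    url = status_url(dispatcher, submission_id)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return summarize(resp.read().decode("utf-8"))
```

Calling fetch_status with the dispatcher's host:port and the submission id returns a small dict, so a wrapper script can choose its own exit code from the driver state instead of relying on spark-submit --status.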
Upvotes: 2