mwlon

Reputation: 1006

spark-submit --status with mesos master returns nothing

I'd like to retrieve the status of a spark job running in cluster mode on a mesos master via the following:

spark-submit --master mesos://<ip>:7077 --status "driver-...-..."

It exits 0 with no logging, no matter what the driver's status is.

I know that it's doing something right, since if I run the command with an invalid Mesos IP/port, I get

Exception in thread "main" org.apache.spark.deploy.rest.SubmitRestConnectionException: Unable to connect to server
at org.apache.spark.deploy.rest.RestSubmissionClient$$anonfun$requestSubmissionStatus$3.apply(RestSubmissionClient.scala:165)

and if I run with an invalid submission id, I get

2018-10-02 18:47:01 ERROR RestSubmissionClient:70 - Error: Server responded with message of unexpected type SubmissionStatusResponse.

Any idea why spark-submit --status isn't returning anything?

Upvotes: 2

Views: 867

Answers (3)

chetan007

Reputation: 9

Add log4j.logger.org.apache.spark.deploy.rest.RestSubmissionClient=INFO and log4j.logger.org.apache.spark.deploy.rest=INFO to the log4j.properties file under /etc/spark/conf, then check the status again:

spark-submit --master spark://:6066 --status driver-20210516043704-0012

Upvotes: 0

kasur

Reputation: 1570

Not sure what version of spark you are using. My investigation is based on spark-2.4.0. The described behaviour is valid for both spark standalone and mesos deployment targets.

org.apache.spark.deploy.rest.RestSubmissionClient is the handler for REST submission requests, and it logs the server's response at INFO level.

org.apache.spark.deploy.SparkSubmit is used as a main class when invoking spark-submit and its logger is the top level root logger for all other loggers.

If no specific logger for SparkSubmit is configured in conf/log4j.properties (the same holds when this file is absent), the default level is programmatically set to WARN.

Going further, in the absence of a specific logger for RestSubmissionClient, it inherits the level of its root logger, which is SparkSubmit's logger.

That is why you still see errors: ERROR messages pass the default WARN threshold, while the INFO-level status response is suppressed.

To see the logs for REST submissions, adjust ${SPARK_HOME}/conf/log4j.properties with either log4j.logger.org.apache.spark.deploy.rest.RestSubmissionClient=INFO or, to cover the other classes in that package as well, log4j.logger.org.apache.spark.deploy.rest=INFO.
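For illustration, the relevant part of ${SPARK_HOME}/conf/log4j.properties might look like the fragment below. This is only a sketch based on the two logger names mentioned above; either line should be enough on its own, and the rest of the file is assumed to stay unchanged.

```properties
# Option 1: raise only the REST client to INFO so the status response is printed
log4j.logger.org.apache.spark.deploy.rest.RestSubmissionClient=INFO

# Option 2: raise every class in the rest package to INFO
log4j.logger.org.apache.spark.deploy.rest=INFO
```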

Upvotes: 0

mwlon

Reputation: 1006

I found a workaround by accessing the dispatcher's api directly:

curl -s "http://$DISPATCHER/v1/submissions/status/$SUBMISSION_ID"

Still no clear answer why spark-submit --status does not behave as documented though.
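If you need the same check from a script, the curl call above can be wrapped in a few lines of Python. This is a rough sketch, not an official client: the helper names (status_url, summarize, fetch_status) are made up for this example, and the JSON field names (submissionId, driverState, success) are an assumption about what the dispatcher returns, so verify them against your own curl output.

```python
import json
import urllib.request


def status_url(dispatcher: str, submission_id: str) -> str:
    """Build the same URL that the curl workaround queries."""
    return f"http://{dispatcher}/v1/submissions/status/{submission_id}"


def summarize(body: str) -> str:
    """Reduce a JSON status response to a single line.

    The field names below are assumed, not confirmed by the thread;
    missing fields fall back to placeholder values.
    """
    data = json.loads(body)
    sid = data.get("submissionId", "?")
    state = data.get("driverState", "UNKNOWN")
    ok = data.get("success")
    return f"{sid}: {state} (success={ok})"


def fetch_status(dispatcher: str, submission_id: str, timeout: float = 10.0) -> str:
    """GET the status endpoint and return the raw response body as text."""
    url = status_url(dispatcher, submission_id)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example usage (needs a reachable dispatcher, so it is left commented out):
# print(summarize(fetch_status("<ip>:7077", "driver-...-...")))
```

Keeping the URL building and the parsing separate from the network call means you can check the parsing against a saved curl response without touching the cluster.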

Upvotes: 2
