Reputation: 791
Running applications in Apache Spark can be checked from the web interface at the URL:
http://<master>:8080
My question is how we can check running applications from the terminal. Is there any command that returns the application status?
Upvotes: 36
Views: 88476
Reputation: 360
I have found that it is possible to use the REST API to submit, kill and get the status of Spark jobs. The REST API is exposed on the master on port 6066.
To create the job, use the following curl command:
curl -X POST http://spark-cluster-ip:6066/v1/submissions/create \
  --header "Content-Type:application/json;charset=UTF-8" \
  --data '{
    "action" : "CreateSubmissionRequest",
    "appArgs" : [ "blah" ],
    "appResource" : "path-to-jar-file",
    "clientSparkVersion" : "2.2.0",
    "environmentVariables" : { "SPARK_ENV_LOADED" : "1" },
    "mainClass" : "app-class",
    "sparkProperties" : {
      "spark.jars" : "path-to-jar-file",
      "spark.driver.supervise" : "false",
      "spark.app.name" : "app-name",
      "spark.submit.deployMode" : "cluster",
      "spark.master" : "spark://spark-master-ip:6066"
    }
  }'
The response includes the success or failure of the above operation and the submissionId:
{
  "submissionId" : "driver-20170829014216-0001",
  "serverSparkVersion" : "2.2.0",
  "success" : true,
  "message" : "Driver successfully submitted as driver-20170829014216-0001",
  "action" : "CreateSubmissionResponse"
}
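If you want to script against this response, the submissionId can be extracted with a JSON tool. Here is a minimal sketch, assuming jq is installed and the request body above has been saved to a file called submission.json (both assumptions of mine, not part of the original answer):
# Sketch: capture the submissionId from the create response (assumes jq and a submission.json payload file)
SUBMISSION_ID=$(curl -s -X POST http://spark-cluster-ip:6066/v1/submissions/create \
  --header "Content-Type:application/json;charset=UTF-8" \
  --data @submission.json | jq -r '.submissionId')
echo "$SUBMISSION_ID"    # e.g. driver-20170829014216-0001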
To delete the job, use the submissionId obtained above:
curl -X POST http://spark-cluster-ip:6066/v1/submissions/kill/driver-20170829014216-0001
The response again contains success/failure status:
{
  "success" : true,
  "message" : "Kill request for driver-20170829014216-0001 submitted",
  "action" : "KillSubmissionResponse",
  "serverSparkVersion" : "2.2.0",
  "submissionId" : "driver-20170829014216-0001"
}
To get the status, use the following command:
curl http://spark-cluster-ip:6066/v1/submissions/status/driver-20170829014216-0001
The response includes the driver state, i.e. the current status of the app:
{
  "action" : "SubmissionStatusResponse",
  "driverState" : "RUNNING",
  "serverSparkVersion" : "2.2.0",
  "submissionId" : "driver-20170829203736-0004",
  "success" : true,
  "workerHostPort" : "10.32.1.18:38317",
  "workerId" : "worker-20170829013941-10.32.1.18-38317"
}
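As a small follow-up sketch (not part of the original answer), the driverState field can be polled from a shell loop until the driver reaches a terminal state; this again assumes jq is installed and reuses the submissionId from above:
# Sketch: poll the driver state every 10 seconds until it is no longer pending/running (assumes jq)
while true; do
  STATE=$(curl -s http://spark-cluster-ip:6066/v1/submissions/status/driver-20170829014216-0001 \
    | jq -r '.driverState')
  echo "driverState: $STATE"
  case "$STATE" in
    FINISHED|FAILED|KILLED|ERROR) break ;;
  esac
  sleep 10
done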
I found out about the REST API here.
Upvotes: 3
Reputation: 3134
In my case, my Spark application runs remotely on Amazon AWS EMR, so I use the Lynx command-line browser to access the Spark application's status. After you have submitted your Spark job from one terminal, open another terminal and run the following command from it:
lynx http://localhost:<4043 or other spark job port>
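Note that the Spark UI for the first application on a host normally starts on port 4040, and later applications increment the port (4041, 4042, ...), which is why the port above may differ; for a single job the command typically looks like this:
lynx http://localhost:4040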
Upvotes: 2
Reputation: 74789
If it's for the Spark Standalone or Apache Mesos cluster managers, @sb0709's answer is the way to go.
For YARN, you should use the yarn application command:
$ yarn application -help
usage: application
-appStates <States> Works with -list to filter applications
based on input comma-separated list of
application states. The valid application
state can be one of the following:
ALL,NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUN
NING,FINISHED,FAILED,KILLED
-appTypes <Types> Works with -list to filter applications
based on input comma-separated list of
application types.
-help Displays help for all commands.
-kill <Application ID> Kills the application.
-list List applications. Supports optional use
of -appTypes to filter applications based
on application type, and -appStates to
filter applications based on application
state.
-movetoqueue <Application ID> Moves the application to a different
queue.
-queue <Queue Name> Works with the movetoqueue command to
specify which queue to move an
application to.
-status <Application ID> Prints the status of the application.
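For example (my own illustration, with a made-up application ID), to list running Spark applications and then query or kill one of them:
# List applications of type SPARK that are currently running
yarn application -list -appTypes SPARK -appStates RUNNING
# Print the status of one application (the ID comes from the -list output; this one is made up)
yarn application -status application_1503889793112_0001
# Kill it if necessary
yarn application -kill application_1503889793112_0001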
Upvotes: 21
Reputation: 2500
You can use spark-submit --status
(as described in Mastering Apache Spark 2.0).
spark-submit --status [submission ID]
See the code of spark-submit for reference:
if (!master.startsWith("spark://") && !master.startsWith("mesos://")) {
  SparkSubmit.printErrorAndExit(
    "Requesting submission statuses is only supported in standalone or Mesos mode!")
}
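A minimal example invocation, assuming a standalone cluster whose REST endpoint is the spark://spark-master-ip:6066 URL used in the REST answer above, and the submissionId returned at submit time:
spark-submit --master spark://spark-master-ip:6066 --status driver-20170829014216-0001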
Upvotes: 11