Reputation: 98
Following the instructions in this website, I'm trying to submit a job to Spark via REST API /v1/submissions
.
I tried to submit SparkPi
in the example:
$ ./create.sh
{
"action" : "CreateSubmissionResponse",
"message" : "Driver successfully submitted as driver-20211212044718-0003",
"serverSparkVersion" : "3.1.2",
"submissionId" : "driver-20211212044718-0003",
"success" : true
}
$ ./status.sh driver-20211212044718-0003
{
"action" : "SubmissionStatusResponse",
"driverState" : "SUBMITTED",
"serverSparkVersion" : "3.1.2",
"submissionId" : "driver-20211212044718-0003",
"success" : true
}
create.sh
:
curl -X POST http://172.17.197.143:6066/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data '{
"appResource": "/home/ruc/spark-3.1.2/examples/jars/spark-examples_2.12-3.1.2.jar",
"sparkProperties": {
"spark.master": "spark://172.17.197.143:7077",
"spark.driver.memory": "1g",
"spark.driver.cores": "1",
"spark.app.name": "REST API - PI",
"spark.jars": "/home/ruc/spark-3.1.2/examples/jars/spark-examples_2.12-3.1.2.jar",
"spark.driver.supervise": "true"
},
"clientSparkVersion": "3.1.2",
"mainClass": "org.apache.spark.examples.SparkPi",
"action": "CreateSubmissionRequest",
"environmentVariables": {
"SPARK_ENV_LOADED": "1"
},
"appArgs": [
"400"
]
}'
status.sh
:
export DRIVER_ID=$1
curl http://172.17.197.143:6066/v1/submissions/status/$DRIVER_ID
But when I try to get the status of the job (even after a few minutes), I got a "SUBMITTED" rather than "RUNNING" or "FINISHED".
Then I looked up the log and found that
21/12/12 04:47:18 INFO master.Master: Driver submitted org.apache.spark.deploy.worker.DriverWrapper
21/12/12 04:47:18 WARN master.Master: Driver driver-20211212044718-0003 requires more resource than any of Workers could have.
# ...
21/12/12 04:49:02 WARN master.Master: Driver driver-20211212044718-0003 requires more resource than any of Workers could have.
However, in my spark-env.sh
, I have
export SPARK_WORKER_MEMORY=10g
export SPARK_WORKER_CORES=2
I have no idea what happened. How can I make it run normally?
Upvotes: 0
Views: 1400
Reputation: 757
Since you've checked resources and You have enough. It might be network issue. executor maybe cannot connect back to driver program. Allow traffic on both master and workers.
Upvotes: 1