hard coder

Reputation: 5705

How can I submit multiple jobs in a Spark Standalone cluster?

I have a machine with Apache Spark. The machine has 64 GB of RAM and 16 cores.

My objective in each Spark job is to:

1. Download a gz file from a remote server
2. Extract the gz to get a CSV file (1 GB max)
3. Process the CSV file in Spark and save some stats (roughly as sketched below).
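Roughly, each job does something like the following; the remote URL, local paths and output location here are just placeholders, not my real setup:

import java.net.URL
import java.nio.file.{Files, Paths, StandardCopyOption}
import java.util.zip.GZIPInputStream
import org.apache.spark.sql.SparkSession

object StatsJob {
  def main(args: Array[String]): Unit = {
    val remoteUrl = "https://example.com/data/input.csv.gz"   // placeholder
    val localCsv  = "/tmp/input.csv"                          // placeholder

    // 1 + 2: download the gz file and decompress it to a local CSV
    val in = new GZIPInputStream(new URL(remoteUrl).openStream())
    Files.copy(in, Paths.get(localCsv), StandardCopyOption.REPLACE_EXISTING)
    in.close()

    // 3: process the CSV in Spark and save some stats
    val spark = SparkSession.builder().appName("csv-stats").getOrCreate()
    val df = spark.read.option("header", "true").csv(localCsv)
    df.describe().write.mode("overwrite").csv("/tmp/stats")   // placeholder output path
    spark.stop()
  }
}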

Currently I am submitting one job for each file received by doing the following:

./spark-submit --class ClassName --executor-cores 14 --num-executors 3 --driver-memory 4g --executor-memory 4g jar_path

Then I wait for this job to complete before starting a new job for the next file.

Now I want to utilise the full 64 GB of RAM by running multiple jobs in parallel.

I can assign 4 GB of RAM to each job, and I want new jobs to queue when enough jobs are already running.

How can I achieve this?

Upvotes: 1

Views: 2167

Answers (1)

KZapagol

Reputation: 928

You should submit multiple jobs from different threads:

https://spark.apache.org/docs/latest/job-scheduling.html#scheduling-within-an-application

and configure pool properties (set schedulingMode to FAIR):

https://spark.apache.org/docs/latest/job-scheduling.html#configuring-pool-properties
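For illustration, here is a minimal Scala sketch of that approach: the scheduler mode is set to FAIR and one Spark job per file is submitted from its own thread against a shared SparkSession. The file paths, pool name and output locations are assumptions, not something prescribed by the docs:

import org.apache.spark.sql.SparkSession

object ParallelCsvStats {
  def main(args: Array[String]): Unit = {
    // Placeholder list of already-downloaded CSV files
    val csvFiles = Seq("/tmp/file1.csv", "/tmp/file2.csv", "/tmp/file3.csv")

    val spark = SparkSession.builder()
      .appName("parallel-csv-stats")
      .config("spark.scheduler.mode", "FAIR")   // fair scheduling within this application
      .getOrCreate()

    // One Spark job per file, each submitted from its own thread.
    // All threads share the same SparkSession, so the jobs run concurrently
    // inside a single application and share its executors.
    val threads = csvFiles.map { path =>
      new Thread(new Runnable {
        override def run(): Unit = {
          // Optional: assign this thread's jobs to a pool defined in fairscheduler.xml
          spark.sparkContext.setLocalProperty("spark.scheduler.pool", "default")
          val df = spark.read.option("header", "true").csv(path)
          df.describe().write.mode("overwrite").csv(path + "_stats")   // placeholder output
        }
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    spark.stop()
  }
}

If you do not create a fairscheduler.xml, the jobs simply go to the default pool; the important part is submitting them from separate threads so they can overlap.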

From the Spark documentation:

https://spark.apache.org/docs/latest/spark-standalone.html#resource-scheduling:

The standalone cluster mode currently only supports a simple FIFO scheduler across applications. However, to allow multiple concurrent users, you can control the maximum number of resources each application will use. By default, it will acquire all cores in the cluster, which only makes sense if you just run one application at a time. You can cap the number of cores by setting spark.cores.max ...

By default, a single application will take all of the cluster's resources. We need to cap the resources per application so that there is room to run other applications as well. Below is a command you can use to submit a Spark job with capped resources:

bin/spark-submit --class classname --master spark://hjvm1:6066 --deploy-mode cluster  --driver-memory 500M --conf spark.executor.memory=1g --conf spark.cores.max=1 /data/test.jar
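The same caps can also be set programmatically when the SparkSession is built, instead of on the spark-submit command line; a minimal sketch (the values are just illustrative):

import org.apache.spark.sql.SparkSession

object CappedApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("capped-app")
      .config("spark.executor.memory", "1g")   // memory per executor
      .config("spark.cores.max", "1")          // total cores this application may take
      .getOrCreate()
    // ... your processing here ...
    spark.stop()
  }
}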

Upvotes: 3
