Reputation: 333
I would like to submit multiple spark-submit jobs to YARN. When I run
spark-submit --class myclass --master yarn --deploy-mode cluster blah blah
as it is now, I have to wait for the job to complete before I can submit more jobs. I see the heartbeat:
16/09/19 16:12:41 INFO yarn.Client: Application report for application_1474313490816_0015 (state: RUNNING)
16/09/19 16:12:42 INFO yarn.Client: Application report for application_1474313490816_0015 (state: RUNNING)
How can I tell YARN to pick up another job, all from the same terminal? Ultimately I want to run this from a script that can send hundreds of jobs in one go.
Thank you.
Upvotes: 4
Views: 2930
Reputation: 805
Upvotes: 1
Reputation: 9015
Every user has a fixed capacity, as specified in the YARN configuration. If you are allocated N executors (usually you will be allocated some fixed number of vcores) and you want to run 100 jobs, you will need to specify the allocation for each job, replacing N/100 with the actual integer:
spark-submit --num-executors N/100 --executor-cores 5
Otherwise, the jobs will sit waiting in the ACCEPTED state.
You can launch multiple jobs in parallel by appending & to each invocation:
for i in $(seq 20); do spark-submit --master yarn --num-executors N/100 --executor-cores 5 blah blah & done
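To make the loop and the N/100 arithmetic concrete, here is a minimal sketch of a wrapper script. The variable names (TOTAL_EXECUTORS, NUM_JOBS, SPARK_SUBMIT), the helper functions, and the default numbers are all assumptions for illustration, not part of Spark or YARN. Replace them with your queue's real capacity and your own job arguments.

```shell
#!/usr/bin/env bash
# Sketch: split a fixed executor budget across several spark-submit jobs
# and launch them all in the background from one terminal.

# Hypothetical defaults; set these to your real allocation and job count.
TOTAL_EXECUTORS=${TOTAL_EXECUTORS:-100}
NUM_JOBS=${NUM_JOBS:-20}
# Submit command, overridable so the script can be dry-run with "echo".
SPARK_SUBMIT=${SPARK_SUBMIT:-spark-submit}

# Integer share of executors per job, never below 1.
executors_per_job() {
  local total=$1 jobs=$2
  local share=$(( total / jobs ))
  if [ "$share" -lt 1 ]; then
    share=1
  fi
  echo "$share"
}

# Launch NUM_JOBS submissions in the background; any extra arguments
# (jar, application args, --conf options) are passed through unchanged.
submit_all() {
  local per_job i
  per_job=$(executors_per_job "$TOTAL_EXECUTORS" "$NUM_JOBS")
  for i in $(seq "$NUM_JOBS"); do
    $SPARK_SUBMIT --master yarn --deploy-mode cluster \
      --num-executors "$per_job" --executor-cores 5 \
      --class myclass "$@" &
  done
  wait  # return once every background client process has exited
}
```

After sourcing the script, you would call something like submit_all app.jar, where app.jar is a placeholder for your own jar and arguments. Running it with SPARK_SUBMIT=echo prints the commands instead of submitting them, which is a cheap way to check the numbers first. If you would rather have each client return right after submission instead of polling the application report, cluster mode has the setting spark.yarn.submit.waitAppCompletion; passing --conf spark.yarn.submit.waitAppCompletion=false as an extra argument should do that.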
Upvotes: 3