ab3

Reputation: 333

spark-submit on yarn - multiple jobs

I would like to submit multiple spark-submit jobs with yarn. When I run

spark-submit --class myclass --master yarn --deploy-mode cluster blah blah

as it is now, I have to wait for the job to complete before I can submit more jobs. I see the heartbeat:

16/09/19 16:12:41 INFO yarn.Client: Application report for application_1474313490816_0015 (state: RUNNING)
16/09/19 16:12:42 INFO yarn.Client: Application report for application_1474313490816_0015 (state: RUNNING)

How can I tell yarn to pick up another job, all from the same terminal? Ultimately I want to be able to run from a script where I can send hundreds of jobs in one go.

Thank you.

Upvotes: 4

Views: 2930

Answers (2)

avrsanjay

Reputation: 805

  • Check dynamic allocation in Spark (see the sketch after this list)
  • Check which scheduler is in use with YARN; if it is FIFO, change it to FAIR (see the yarn-site.xml snippet below)
  • How are you planning to allocate resources to N number of jobs on yarn?
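A minimal sketch of the first two checks. The property names are the standard Spark and YARN settings; the executor counts are assumptions for illustration. Note that dynamic allocation on YARN also requires the external shuffle service:

spark-submit --class myclass --master yarn --deploy-mode cluster \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.dynamicAllocation.maxExecutors=10 \
  blah blah

To switch YARN from FIFO to the fair scheduler, set this in yarn-site.xml and restart the ResourceManager:

<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>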

Upvotes: 1

Aman

Reputation: 9015

Every user has a fixed capacity as specified in the YARN configuration. If you are allocated N executors (usually, you will be allocated some fixed number of vcores), and you want to run 100 jobs, you will need to specify the allocation for each of the jobs:

spark-submit --num-executors N/100 --executor-cores 5

Otherwise, the jobs will stay stuck in the ACCEPTED state, waiting for resources.
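To make the arithmetic concrete (the numbers here are assumptions for illustration): if your queue is allocated 500 executors and you want 100 jobs running concurrently, give each job 500/100 = 5 executors:

spark-submit --class myclass --master yarn --deploy-mode cluster --num-executors 5 --executor-cores 5 blah blah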

You can launch multiple jobs in parallel by appending & to the end of every invocation.

for i in $(seq 20); do spark-submit --master yarn --num-executors N/100 --executor-cores 5 blah blah & done
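If the argument lists for your hundreds of jobs live in a file, the same pattern scales; a minimal sketch, assuming a hypothetical jobs.txt holding one set of spark-submit arguments per line:

while read -r args; do
  spark-submit --master yarn --deploy-mode cluster $args &   # background each submission
done < jobs.txt
wait   # block until every backgrounded spark-submit process has exited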

Upvotes: 3
