Reputation: 2652
Currently I am using the default YARN scheduler, but I would like to do something like this:
Run Yarn using the default scheduler
If (number of jobs in queue > X) {
Change the Yarn scheduler to FIFO
}
Is this even possible through code?
Note that I am running Spark jobs on an AWS EMR cluster with YARN as the resource manager (RM).
Upvotes: 0
Views: 925
Reputation: 2343
Well, this is possible by running a poller that checks the current queue (using the RM REST API), updates yarn-site.xml, and then probably restarts the RM. However, restarting the RM can impact your queue, because the current jobs will be killed or shut down (and probably retried later).
If you need a more efficient switch between the Capacity and FIFO schedulers, you may need to extend those classes and design your own scheduler that implements the logic of your pseudocode.
By default, EMR uses the Capacity Scheduler with DefaultResourceCalculator and runs jobs on the default queue. For example, EMR keeps its YARN configuration at paths like the following:
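To make the poller idea concrete, here is a minimal sketch in Python. It assumes the RM web service is reachable on port 8088 and reads the `appsPending` field from the cluster metrics endpoint (`/ws/v1/cluster/metrics`). The `RM_URL` value, the function names, and the threshold policy are illustrative assumptions, not something EMR provides.

```python
import json
import urllib.request

# Assumed RM address; adjust to your master node (8088 is the usual RM web port).
RM_URL = "http://localhost:8088"


def fetch_cluster_metrics(rm_url=RM_URL, timeout=10):
    """Return the parsed JSON body of the RM cluster metrics endpoint."""
    url = f"{rm_url}/ws/v1/cluster/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def pending_apps(metrics):
    """Extract the number of applications waiting to be scheduled."""
    return int(metrics["clusterMetrics"]["appsPending"])


def should_switch_to_fifo(metrics, threshold):
    """Hypothetical policy from the question: switch once pending apps exceed X."""
    return pending_apps(metrics) > threshold
```

A poller would call `should_switch_to_fifo(fetch_cluster_metrics(), X)` on a timer and, when it returns true, rewrite the scheduler setting and restart the RM, with the job-loss caveat mentioned above.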
/home/hadoop/.versions/2.4.0-amzn-6/etc/hadoop/yarn-site.xml
<property><name>yarn.resourcemanager.scheduler.class</name><value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value></property>
with
/home/hadoop/.versions/2.4.0-amzn-6/etc/hadoop/capacity-scheduler.xml
<property><name>yarn.scheduler.capacity.resource-calculator</name><value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value></property>
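For the "update yarn-site.xml" step, something along these lines could work. It is only a sketch: `set_property` is a hypothetical helper, and it assumes the file has the usual Hadoop `<configuration><property>...` layout. Round-tripping through ElementTree drops XML comments, so keep a backup of the original file. The RM still has to be restarted afterwards to pick up the new scheduler class.

```python
import xml.etree.ElementTree as ET

SCHEDULER_KEY = "yarn.resourcemanager.scheduler.class"
FIFO_SCHEDULER = (
    "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler"
)


def set_property(xml_text, name, value):
    """Return xml_text with the Hadoop-style property `name` set to `value`."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                # Property exists without a <value>; create one.
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # Property not present yet: append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")
```

Usage would be to read yarn-site.xml into a string, call `set_property(text, SCHEDULER_KEY, FIFO_SCHEDULER)`, and write the result back to the same path.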
Upvotes: 1