Yaser

Reputation: 29

How to increase the number of running containers of a Spark application on an EMR cluster?

I am using an EMR cluster with 1 master and 11 m5.2xlarge core nodes. After doing some calculations based on this instance type, I used the following JSON to set my Spark application configuration on EMR:

[
    {
        "Classification": "capacity-scheduler",
        "Properties": {
            "yarn.scheduler.capacity.resource-calculator": "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator"
        }
    },
    {
        "Classification": "yarn-site",
        "Properties": {
            "yarn.nodemanager.vmem-check-enabled": "false",
            "yarn.nodemanager.pmem-check-enabled": "false"
        }
    },
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.dynamicAllocation.enabled": "false",
            "spark.worker.instances": "5",
            "spark.driver.memory": "20g",
            "spark.executor.memory": "20g",
            "spark.executor.cores": "5",
            "spark.driver.cores": "5",
            "spark.executor.instances": "14",
            "spark.yarn.executor.memoryOverhead": "4g",
            "spark.default.parallelism": "140"
        }
    },
    {
        "Classification": "spark",
        "Properties": {
            "maximizeResourceAllocation": "false"
        }
    }
]
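For reference, I apply this configuration at cluster creation; with boto3 that looks roughly like the sketch below (the cluster name, release label, and IAM roles are placeholders, not my actual values):

    import boto3

    emr = boto3.client("emr")

    # Sketch of launching the cluster with the configuration array above;
    # Name, ReleaseLabel, and the roles are placeholder values.
    emr.run_job_flow(
        Name="my-spark-cluster",               # placeholder name
        ReleaseLabel="emr-5.30.0",             # assumed EMR release
        Applications=[{"Name": "Spark"}],
        Instances={
            "MasterInstanceType": "m5.2xlarge",
            "SlaveInstanceType": "m5.2xlarge",  # 11 core nodes
            "InstanceCount": 12,                # 1 master + 11 core
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        Configurations=[
            # ... the JSON array shown above goes here ...
        ],
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )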

However, the number of running containers on this cluster is not what I expected (usually it matches the number of running cores). There are only 11 running containers; how can I increase this number to 51, the number of vCores in use?

Upvotes: 1

Views: 2381

Answers (1)

Dave

Reputation: 2049

The instance type m5.2xlarge has 8 vCPUs and 32 GiB RAM. You could run 4 executors per node at 2 vCPUs and 7 GiB each, which across your 11 core nodes gives a total of 44 executors. That leaves 4 GiB of overhead on each worker node, which should be plenty.
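In case it helps to see the arithmetic, here is the same sizing as a short Python sketch (plain numbers mirroring the figures above; no Spark APIs involved):

    # Sizing for 11 core nodes of m5.2xlarge (8 vCPUs, 32 GiB RAM each)
    nodes = 11
    vcpus_per_node, ram_gib_per_node = 8, 32

    executor_cores = 2
    executors_per_node = vcpus_per_node // executor_cores      # 4
    executor_instances = nodes * executors_per_node            # 44

    executor_memory_gib = 7
    overhead_gib = ram_gib_per_node - executors_per_node * executor_memory_gib

    print(executor_instances, overhead_gib)                    # 44 4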

Your spark-defaults configuration should therefore be:

    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.dynamicAllocation.enabled": "false",
            "spark.executor.instances": "44",
            "spark.executor.cores": "2",
            "spark.executor.memory": "7g"
        }
    }
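Once a job is running, you can sanity-check that the settings took effect from a PySpark session on the cluster (a quick sketch):

    from pyspark.sql import SparkSession

    # Reads back the effective config from the running session
    spark = SparkSession.builder.getOrCreate()
    print(spark.conf.get("spark.executor.instances"))  # expect "44"
    print(spark.conf.get("spark.executor.cores"))      # expect "2"
    print(spark.conf.get("spark.executor.memory"))     # expect "7g"

With these settings, the YARN ResourceManager UI should show roughly 45 running containers: 44 executors plus the application master.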

Upvotes: 3
