Reputation: 20410
I launch a Python Spark program like this:
/usr/lib/spark/bin/spark-submit \
--master yarn \
--executor-memory 2g \
--driver-memory 2g \
--num-executors 2 --executor-cores 4 \
my_spark_program.py
I get the error:
Required executor memory (2048+4096 MB) is above the max threshold (5760 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
This is a brand-new EMR 5 cluster with one m3.2xlarge master node and two m3.xlarge core nodes. Everything should be set to defaults. I am currently the only user, running only one job on this cluster.
If I lower executor-memory from 2g to 1500m, it works. This seems awfully low. An EC2 m3.xlarge server has 15GB of RAM. These are Spark worker/executor machines, they have no other purpose, so I would like to use as much of that as possible for Spark.
Can someone explain how I go from having an EC2 worker instance with 15GB to being able to assign a Spark worker only 1.5GB?
On http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/TaskConfiguration_H2.html I see that the m3.xlarge default for yarn.nodemanager.resource.memory-mb is 11520 MB, or 5760 MB with HBase installed. I'm not using HBase, but I believe it is installed on my cluster. Would removing HBase free up a lot of memory? Is yarn.nodemanager.resource.memory-mb the most relevant setting for available memory?
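For reference, if I wanted to raise that ceiling myself, EMR accepts property overrides for yarn-site as a configuration classification at cluster creation. A sketch only (the 11520 value is the documented non-HBase m3.xlarge default, not something I've verified on this cluster):

```json
[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.nodemanager.resource.memory-mb": "11520",
      "yarn.scheduler.maximum-allocation-mb": "11520"
    }
  }
]
```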
When I pass --executor-memory to spark-submit, is that per core or for the whole worker?
In the error Required executor memory (2048+4096 MB), the first value (2048) is what I pass to --executor-memory, and I can change it and see the error message change accordingly. What is the second value, 4096 MB? How can I change it? Should I change it?
I tried to post this issue to the AWS developer forum (https://forums.aws.amazon.com/forum.jspa?forumID=52), but I get the error "Your message quota has been reached. Please try again later." even though I haven't posted anything. Why would I not have permission to post a question there?
Upvotes: 2
Views: 1987
Reputation: 11593
Yes, if HBase is installed, it will use quite a bit of memory by default. You should not put it on your cluster unless you need it.
Your error would make sense if there were only one core node: 6 GB (4 GB for the two executors, 2 GB for the driver) would be more memory than your resource manager has available to allocate. With two core nodes, you should actually be able to allocate three 2 GB executors: one on the node with the driver, and two on the other.
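That packing can be sketched numerically. This assumes each core node exposes 5760 MB to YARN (the HBase-installed default from the AWS docs) and ignores per-container overhead, so treat it as an approximation only:

```python
# Hypothetical layout: driver + 1 executor on node 1, 2 executors on node 2
node_capacity_mb = 5760   # assumed per-node YARN memory (HBase-installed default)
driver_mb = 2048
executor_mb = 2048

node1 = driver_mb + executor_mb   # 4096 MB requested on node 1
node2 = 2 * executor_mb           # 4096 MB requested on node 2
print(node1 <= node_capacity_mb and node2 <= node_capacity_mb)  # True
```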
In general, this sheet could help make sure you get the most out of your cluster.
Upvotes: 1