Reputation: 1335
How to calculate optimal memory settings for the spark-submit command?
I am bringing 4.5 GB of data into Spark from Oracle, performing some transformations such as a join with a Hive table, and writing the result back to Oracle. My question is how to come up with a spark-submit command with optimal memory parameters. This is what I currently use:
spark-submit --master yarn-cluster --driver-cores 2 \
--driver-memory 2G --num-executors 10 \
--executor-cores 5 --executor-memory 2G \
--class com.spark.sql.jdbc.SparkDFtoOracle2 \
Spark-hive-sql-Dataframe-0.0.1-SNAPSHOT-jar-with-dependencies.jar
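For reference, the job (com.spark.sql.jdbc.SparkDFtoOracle2) does roughly the following. This is a minimal sketch using the Spark 2.x SparkSession API; the JDBC URL, credentials, table names, and join key are placeholders, not the real values:

import org.apache.spark.sql.SparkSession

object SparkDFtoOracle2 {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SparkDFtoOracle2")
      .enableHiveSupport()
      .getOrCreate()

    // Read ~4.5 GB from Oracle over JDBC (URL and credentials are placeholders)
    val oracleDF = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")
      .option("dbtable", "SRC_TABLE")
      .option("user", "user")
      .option("password", "password")
      .load()

    // Join with a Hive table (table name and join key are placeholders)
    val hiveDF = spark.table("default.dim_table")
    val result = oracleDF.join(hiveDF, Seq("id"))

    // Write the joined result back to Oracle
    result.write
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")
      .option("dbtable", "TGT_TABLE")
      .option("user", "user")
      .option("password", "password")
      .mode("append")
      .save()

    spark.stop()
  }
}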
How do I calculate what the driver memory should be, how much driver/executor memory is required, how many cores are needed, and so on?
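For context, the only sizing guidance I have found is the common rule of thumb: about 5 cores per executor, 1 core and 1 GB per node reserved for the OS and Hadoop daemons, 1 executor slot reserved for the YARN ApplicationMaster, and roughly 7-10% of executor memory set aside for YARN memory overhead. A minimal sketch of that arithmetic, assuming a hypothetical cluster of 6 nodes with 16 cores and 64 GB RAM each (made-up numbers, not my actual cluster):

// Hypothetical cluster: 6 nodes, 16 cores and 64 GB RAM per node
val nodes = 6
val coresPerNode = 16
val memPerNodeGB = 64

val coresPerExecutor = 5                                      // rule of thumb: <= 5 cores for good HDFS throughput
val usableCores = coresPerNode - 1                            // 1 core reserved for OS/Hadoop daemons
val executorsPerNode = usableCores / coresPerExecutor         // = 3
val numExecutors = nodes * executorsPerNode - 1               // 1 slot reserved for the YARN AM, = 17
val memPerExecutorGB = (memPerNodeGB - 1) / executorsPerNode  // 1 GB reserved per node, = 21
val executorMemoryGB = (memPerExecutorGB * 0.9).toInt         // ~10% off for spark.yarn.executor.memoryOverhead, = 18

println(s"--num-executors $numExecutors --executor-cores $coresPerExecutor --executor-memory ${executorMemoryG}G".replace("${executorMemoryG}", executorMemoryGB.toString))

What I don't understand is how to adapt numbers like these to a specific workload, such as my 4.5 GB Oracle-to-Oracle job.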
Upvotes: 0
Views: 3429