Ideal Spark configuration

Question

am using Apache spark on HDFS with MapR in our project. We are facing issue running spark Jobs, as its failing after a small increase in the data. We are reading data from csv file, doing some trasnformation, aggreation and then storing in HBase.

Current Data size = 3TB

Available resources: Total nodes : 14 Memory Available : 1TB Total VCores : 450 Total Disk : 150 TB

Spark Conf: executorCores : 2 executorInstance : 50 executorMemory: 40GB minPartitions: 600

please suggest, if the above configuration looks fine, because the error am getting looks like it going outOfMemory.

Ideal Spark configuration

Answers (1)

Related Questions