Y. Eliash

Reputation: 2068

Simple Spark job fails due to GC overhead limit

I've set up a standalone Spark (2.1.1) cluster on my local machines, with 9 cores / 80G on each machine (27 cores / 240G RAM in total).

I have a sample Spark job that sums all the numbers from 1 to x. This is the code:

package com.example

import org.apache.spark.sql.SparkSession

object ExampleMain {

    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder
          .master("spark://192.168.1.2:7077")
          .config("spark.driver.maxResultSize" ,"3g")
          .appName("ExampleApp")
          .getOrCreate()
      val sc = spark.sparkContext
      // List.range builds the whole list in driver memory before it is distributed
      val rdd = sc.parallelize(List.range(1, 1000))
      val sum = rdd.reduce((a,b) => a+b)
      println(sum)
      done
    }

    // Prints a marker so the end of the job is easy to spot in the logs
    def done = {
      println("\n\n")
      println("-------- DONE --------")
    }
}

When I run the above code I get a result after a few seconds, so I cranked it up to sum all the numbers from 1 to 1B (1,000,000,000), and then I get a "GC overhead limit exceeded" error.

I read that Spark should spill to disk if there isn't enough memory. I've tried playing with my cluster configuration, but that didn't help.

Driver memory = 6G
Number of workers = 24
Cores per worker = 1
Memory per worker = 10G

I'm not a developer and have no knowledge of Scala, but I would like to find a way to run this code without GC issues.

Per @philantrovert's request, I'm adding my spark-submit command:

/opt/spark-2.1.1/bin/spark-submit \
--class "com.example.ExampleMain" \
--master spark://192.168.1.2:6066 \
--deploy-mode cluster \
/mnt/spark-share/example_2.11-1.0.jar

In addition, my spark/conf settings are as follows:

Thanks

Upvotes: 2

Views: 426

Answers (1)

Raphael Roth

Reputation: 27373

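I suppose the problem is that you create a List with 1 billion entries on the driver, which is a huge data structure (about 4 GB for the Int values alone, before any per-element List overhead) that has to be built in driver memory before it can be distributed. There is a more efficient way to programmatically create a Dataset/RDD: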

val rdd = spark.range(1000000000L).rdd
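
To show how that one-liner could slot into the job from the question, here is a minimal sketch. The object name RangeSumSketch and the helper sumRange are hypothetical names chosen for illustration, and the sketch assumes the master URL is supplied by spark-submit rather than hard-coded. spark.range returns a Dataset of java.lang.Long values that the executors generate partition by partition, so no large collection is ever built on the driver.

```scala
import org.apache.spark.sql.SparkSession

object RangeSumSketch {

  // Hypothetical helper: sums the integers in [start, endExclusive) on the cluster.
  def sumRange(spark: SparkSession, start: Long, endExclusive: Long): Long =
    spark.range(start, endExclusive) // Dataset[java.lang.Long], generated lazily per partition
      .rdd
      .map(_.longValue)              // unbox java.lang.Long to a Scala Long
      .reduce(_ + _)

  def main(args: Array[String]): Unit = {
    // Assumption: the master URL comes from spark-submit (--master), not from code.
    val spark = SparkSession.builder.appName("ExampleApp").getOrCreate()
    try println(sumRange(spark, 1L, 1000000000L))
    finally spark.stop()
  }
}
```

A side effect of working with Long values is that the total no longer overflows: the sum of 1 to 999,999,999 is 499,999,999,500,000,000, which is far beyond Int.MaxValue (2,147,483,647), so the Int-based reduce in the question would have wrapped around even if memory had not been an issue.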

Upvotes: 3
