EarthlingMATH

Reputation: 13

Passing sparkSession as function parameters spark-scala

I'm in the process of generating tables using spark-scala and I am concerned about efficiency.

Would passing the SparkSession as a function parameter make my program slower? Is it any slower than calling SparkSession.builder.getOrCreate() inside each function?

I am using YARN as the master.

Thanks in advance.

Upvotes: 0

Views: 4470

Answers (1)

Salim

Reputation: 2178

You can create the Spark session once and pass it around without losing any performance; the session is an ordinary object reference, so passing it as a parameter costs nothing. However, it is a little inconvenient to modify every method signature to take a session object. You can avoid that by calling getOrCreate inside the function, which returns the same global session without it being passed in. When getOrCreate creates the session, it registers it as the default (SparkSession.setDefaultSession) and hands that same session back on later getOrCreate calls.

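    import org.apache.spark.sql.SparkSession

    // Create the session once, at application start-up
    val spark: SparkSession = SparkSession.builder
      .appName("test")
      .master("local[2]")
      .getOrCreate()

    // Option 1: pass the session in as a parameter
    def function1(session: SparkSession): Unit = {
      // use session here
    }
    function1(spark)

    // Option 2: obtain the same session without passing it
    def function2(): Unit = {
      val s = SparkSession.builder.getOrCreate()
      // s is the session created above, not a new one
    }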
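
If you want to convince yourself that both approaches end up with the same session, a minimal sketch like the one below should do it. The object name `SessionReuseCheck` and the helpers `lookup`, `countWithParam` and `countWithLookup` are made up for this illustration, and it assumes Spark SQL is on the classpath and that everything runs on the same driver thread.

```scala
import org.apache.spark.sql.SparkSession

object SessionReuseCheck {
  // Hypothetical helper: fetches the session via getOrCreate instead of a parameter
  def lookup(): SparkSession = SparkSession.builder.getOrCreate()

  // Hypothetical helper: uses a session that was passed in
  def countWithParam(spark: SparkSession): Long =
    spark.range(5).count()

  // Hypothetical helper: uses the session obtained through getOrCreate
  def countWithLookup(): Long =
    lookup().range(5).count()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("test")
      .master("local[2]")
      .getOrCreate()

    // Reference equality: getOrCreate returned the existing session object
    assert(lookup() eq spark)

    // Both styles run the same job on the same session
    assert(countWithParam(spark) == countWithLookup())

    spark.stop()
  }
}
```

Since both helpers work on the same object, the choice between them is about code style (explicit dependencies versus shorter signatures), not about speed.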

Upvotes: 3
