Reputation: 13
I'm generating tables using Spark with Scala and I am concerned about efficiency.
Would passing the SparkSession around make my program slower? Is it any slower than calling SparkSession.builder.getOrCreate() where I need it?
I am using YARN as the master.
Thanks in advance.
Upvotes: 0
Views: 4470
Reputation: 2178
You can create the Spark session once and pass it around without losing any performance, since you are only passing a reference to the same object.
However, it is a little inconvenient to change method signatures just to pass in a session object. You can avoid that by calling SparkSession.builder.getOrCreate() inside the functions, which returns the same global session without passing it. When getOrCreate is first called, it creates the session and registers it as the default (see SparkSession.setDefaultSession). Later getOrCreate calls return that existing session instead of creating a new one.
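import org.apache.spark.sql.SparkSession

// create the session once
val spark: SparkSession = SparkSession.builder
  .appName("test")
  .master("local[2]")
  .getOrCreate()

// option 1: pass the session into the function
// (function1 is assumed to be defined elsewhere and to take a SparkSession)
function1(spark)

// option 2: obtain the same session without passing it
def function2(): Unit = {
  val s = SparkSession.builder.getOrCreate()
}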
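If you want to check this yourself, here is a minimal sketch that compares the two approaches side by side. The object name SessionReuseSketch and the two helper functions are hypothetical names made up for illustration, and local[2] is only used so it runs on one machine (on YARN you would normally leave the master to spark-submit). It assumes a Spark 2.x or later dependency on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object SessionReuseSketch {
  // Hypothetical helper: receives the session explicitly.
  def countWithPassedSession(spark: SparkSession): Long =
    spark.range(10).count()

  // Hypothetical helper: looks the session up instead of receiving it.
  def countWithLookup(): Long = {
    val s = SparkSession.builder.getOrCreate()
    s.range(10).count()
  }

  def main(args: Array[String]): Unit = {
    // Create the session once at the entry point.
    val spark = SparkSession.builder
      .appName("session-reuse-sketch")
      .master("local[2]") // assumption: local run for the sketch
      .getOrCreate()

    // A second getOrCreate should hand back the existing session,
    // so a reference comparison is expected to succeed.
    val again = SparkSession.builder.getOrCreate()
    println(s"same instance: ${spark eq again}")

    // Both helpers run against the same underlying session.
    println(countWithPassedSession(spark))
    println(countWithLookup())

    spark.stop()
  }
}
```

Either way the work is done by the same session, so the choice is mostly about style: passing the session explicitly makes the dependency visible and easier to test, while getOrCreate keeps the signatures shorter.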
Upvotes: 3