Luiz Viola

Reputation: 2436

Why I don't need to create a SparkSession in Databricks?

Why I don't need to create a SparkSession in Databricks? Is a SparkSession created automatically when the cluster is set? Or somebody else did it for me?

Upvotes: 9

Views: 15896

Answers (2)

Alex Ott

Reputation: 87154

That is done only in notebooks, to simplify the user's work and avoid making them specify parameters, many of which would have no effect because Spark is already started. This behavior is similar to what you get when you start spark-shell or pyspark: both initialize the SparkSession and SparkContext for you:

Spark context available as 'sc' (master = local[*], app id = local-1635579272032).
SparkSession available as 'spark'.

But if you're running code from jar or Python wheel as job, then it's your responsibility to create corresponding objects.

Upvotes: 10

In the Databricks environment, as in Spark 2.0 generally, the same effect is achieved through SparkSession without explicitly creating SparkConf, SparkContext, or SQLContext, as they are encapsulated within the SparkSession. Using a builder design pattern, it instantiates a SparkSession object if one does not already exist, along with its associated underlying contexts. ref: link

Upvotes: 0
