Punsh

Reputation: 51

Unable to use a new SparkSession after stopping the previous one

I created a SparkSession (with enableHiveSupport()) running locally. I want to execute a set of SQL statements in one SparkSession, stop it, and start another.

But when I stop the SparkSession and get a new one via SparkSession.builder(), I do get a new SparkSession object, yet the SQL fails with "Another instance of Derby may have already booted the database.."

Since we can have only one SparkContext per JVM, does this mean I cannot getOrCreate a SparkSession, stop it, and repeat?

Is there any way to execute a set of SQL statements in a new session every time? (I know about SparkSession.newSession, but I can't stop that session either, since stopping it would stop the shared underlying SparkContext, right?)
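Roughly, this is the pattern that fails for me (a minimal sketch; the second getOrCreate() hands back a fresh session, but its first Hive call hits the Derby lock):

import org.apache.spark.sql.SparkSession

// first session: works fine
val spark1 = SparkSession.builder()
  .master("local[*]")
  .enableHiveSupport()
  .getOrCreate()
spark1.sql("SHOW TABLES").show()
spark1.stop()

// second session: getOrCreate() returns a new SparkSession object...
val spark2 = SparkSession.builder()
  .master("local[*]")
  .enableHiveSupport()
  .getOrCreate()
// ...but the first sql() call fails with
// "Another instance of Derby may have already booted the database.."
spark2.sql("SHOW TABLES").show()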

Upvotes: 2

Views: 4265

Answers (1)

Rajnish Kumar

Reputation: 2938

Hi, you can go with SparkSession.newSession because, as per the official documentation:

SparkSession.newSession: Start a new session with isolated SQL configurations; temporary tables and registered functions are isolated, but the underlying SparkContext and cached data are shared.

Note
    Other than the SparkContext, all shared state is initialized lazily. This method will
    force the initialization of the shared state to ensure that parent and child sessions
    are set up with the same shared state. If the underlying catalog implementation is
    Hive, this will initialize the metastore, which may take some time.
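The isolation of SQL configurations means, for instance, that a setting changed on the child session does not affect the parent. A small sketch to illustrate (spark.sql.shuffle.partitions is just an example property):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
            .appName("config isolation")
            .master("local[*]")
            .getOrCreate()
val child = spark.newSession()

child.conf.set("spark.sql.shuffle.partitions", "4")
println(child.conf.get("spark.sql.shuffle.partitions")) // 4
println(spark.conf.get("spark.sql.shuffle.partitions")) // still the default (200)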

Here is sample code showing how you can use multiple Spark sessions:

import org.apache.spark.sql.SparkSession

object WorkingWithNewSession {
  def main(args: Array[String]): Unit = {

    // create the initial session
    val spark = SparkSession
                .builder()
                .appName("understanding session")
                .master("local[*]")
                .getOrCreate()

    import spark.implicits._
    val df = Seq("name", "apple").toDF()

    df.createOrReplaceTempView("testTable")       // visible only in this session; do not rely on it across sessions
    df.createOrReplaceGlobalTempView("testTable") // visible to every session of this application

    spark.sql("SELECT * FROM testTable").show()
    // spark.stop()  // do not call this here, as it would stop the shared SparkContext

    val newSpark = spark.newSession()
    newSpark.sql("SELECT * FROM global_temp.testTable").show() // access the global view through the global_temp database

    spark.stop() // call this at the end to stop the SparkContext and, with it, all sessions
  }
}
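Note that a global temporary view is tied to the reserved global_temp database and lives as long as the Spark application (i.e. the SparkContext), which is what makes it usable from every session created with newSession.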

Upvotes: 3
