Reputation: 61
How can I create the SparkSession?
scala> import org.apache.spark.SparkConf
import org.apache.spark.SparkConf
scala> import org.apache.spark.SparkContext
import org.apache.spark.SparkContext
scala> val conf = SparkSession.builder.master("local").appName("testing").enableHiveSupport().getOrCreate()
<console>:27: error: not found: value SparkSession
val conf = SparkSession.builder.master("local").appName("testing").enableHiveSupport().getOrCreate()
Upvotes: 6
Views: 21991
Reputation: 28219
Scala Spark:
import org.apache.spark.sql.SparkSession
// note: despite its name, conf here is a SparkSession, not a SparkConf
val conf = SparkSession.builder.master("local").appName("testing").enableHiveSupport().getOrCreate()
PySpark:
from pyspark.sql import SparkSession
spark = SparkSession.builder \
    .appName("testing") \
    .enableHiveSupport() \
    .getOrCreate()
Add .config("spark.some.config.option", "some-value") before .getOrCreate() to set configuration options.
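For example, a minimal sketch in Scala (the config key and value are placeholders from above, not real Spark options):
import org.apache.spark.sql.SparkSession

// build a session with a config option set before getOrCreate;
// "spark.some.config.option" is an illustrative key, not a real setting
val spark = SparkSession.builder
  .master("local")
  .appName("testing")
  .config("spark.some.config.option", "some-value")
  .enableHiveSupport()
  .getOrCreate()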
Refer: Scala Docs, Python Docs
Upvotes: 0
Reputation: 19318
As undefined_variable mentioned, you need to run import org.apache.spark.sql.SparkSession to access the SparkSession class.
It was also mentioned that you don't need to create your own SparkSession in the Spark console because it's already created for you.
Notice the "Spark session available as 'spark'" message when the console is started.
You can run this code in the console, but it doesn't actually create a new SparkSession:
val conf = SparkSession.builder.master("local").appName("testing").enableHiveSupport().getOrCreate()
The getOrCreate portion tells Spark to use an existing SparkSession if one exists and to create a new SparkSession only if necessary. In this case, the Spark Shell already created a SparkSession, so the existing one will be used.
conf == spark // true
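To see the reuse for yourself, here is a quick sketch for the shell (s1 and s2 are illustrative names; this assumes the import above has been run):
// both calls return the session the shell already created,
// so every reference points at the same object
val s1 = SparkSession.builder.getOrCreate()
val s2 = SparkSession.builder.getOrCreate()
s1 == s2    // true
s1 == spark // true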
See this post for more information on how to manage the SparkSession in production applications.
Upvotes: 1
Reputation: 6218
SparkSession is available in Spark 2.x:
import org.apache.spark.sql.SparkSession
Though when you start the spark shell, a SparkSession is already available as the spark variable.
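For example, in a Spark 2.x shell you can use the pre-built session directly (the calls below are just illustrations):
// spark is already in scope in the shell; no import or builder needed
spark.version          // the running Spark version string (echoed by the shell)
spark.range(5).count() // small demo Dataset built from the session; returns 5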
Upvotes: 5