vdep

Reputation: 3590

parallelize() method while using SparkSession in Spark 2.0

I see that SparkSession doesn't have a .parallelize() method. Do we need to use SparkContext again to create an RDD? If so, is it advisable to create both a SparkSession and a SparkContext in a single program?

Upvotes: 20

Views: 19370

Answers (2)

loneStar

Reputation: 4010

The SparkContext is available as a member of the SparkSession class:

val data = spark.sparkContext.parallelize(Seq(1,2,3,4))
data: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:23

Upvotes: 1

eliasah

Reputation: 40380

Once you build your SparkSession, you can fetch the underlying SparkContext created with it as follows.

Let's assume that a SparkSession is already defined:

val spark : SparkSession = ??? 

You can now get the SparkContext:

val sc = spark.sparkContext
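For completeness, here is a minimal self-contained sketch tying the two steps together; the `local[*]` master and the app name are assumptions for local testing, not part of the original answer:

```scala
import org.apache.spark.sql.SparkSession

object ParallelizeExample extends App {
  // Build (or reuse) a SparkSession; master("local[*]") is assumed for local runs.
  val spark = SparkSession.builder()
    .appName("parallelize-example") // hypothetical app name
    .master("local[*]")
    .getOrCreate()

  // The underlying SparkContext is exposed on the session,
  // so RDDs can still be created through it in Spark 2.0+.
  val rdd = spark.sparkContext.parallelize(Seq(1, 2, 3, 4))
  println(rdd.count()) // number of elements in the RDD

  spark.stop()
}
```

This avoids constructing a separate SparkContext by hand: the session owns one, and reusing it is the recommended pattern in Spark 2.0.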

Upvotes: 27
