Reputation: 298
i'm new to Pyspark Please help me with it:
spark = SparkSession.builder.appName("FlightDelayRDD").master("local[*]").getOrCreate()
sc = spark.sparkContext
sc.setSystemProperty("spark.dynamicAllocation.enabled", "true")
sc.setSystemProperty("spark.dynamicAllocation.initialExecutors", "6")
sc.setSystemProperty("spark.dynamicAllocation.minExecutors", "6")
sc.setSystemProperty("spark.dynamicAllocation.schedulerBacklogTimeout", "0.5s")
sc.setSystemProperty("spark.speculation", "true")
I want to set KryoSerializer in pyspark like i configured above.
Upvotes: 1
Views: 4475
Reputation: 31540
Since Spark 2.0.0, we internally use Kryo serializer when shuffling RDDs with simple types, arrays of simple types, or string type.
To set Kryo serializer:
sc.setSystemProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
To check:
spark.sparkContext.getConf().get("spark.serializer")
#u'org.apache.spark.serializer.KryoSerializer'
Upvotes: 2