Baktaawar

Reputation: 7490

How to break lines into multiple lines in Pyspark

I know that in Python one can use a backslash, or even parentheses, to break a long line into multiple lines.

But somehow when I do this in PySpark, the next line is highlighted in red, which suggests something might be wrong.

(conf.setAppName('Learnfit_Recommender')
 .set("spark.executor.memory", "10g")
 .set("spark.executor.cores",5)
 .set("spark.executor.instances",50)
 .set("spark.yarn.executor.memoryOverhead",1024)
)

EDIT 1: I changed the parentheses to backslashes. As the image showed, a few of the '.' characters were highlighted in red, and even the sc variable was marked as red.


Is this the correct way to break lines in PySpark?

Upvotes: 10

Views: 41342

Answers (3)

Yang Bryan

Reputation: 451

There is no need to add a blank space before the backslash in PySpark.

from pyspark import SparkConf, SparkContext

conf = SparkConf()

conf.setAppName('appName')\
    .set("spark.executor.memory", "10g")\
    .set("spark.executor.cores", 5)

sc = SparkContext(conf=conf)

Upvotes: 4

avr

Reputation: 4883

You can use either a backslash or parentheses to break lines in PySpark, just as you do in Python.

You can find both used in the official Spark Python examples on the Spark website.
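Since both styles are plain-Python line continuation, they can be demonstrated without Spark installed. A minimal sketch, using a hypothetical FakeConf stand-in whose set() returns self exactly the way SparkConf.set does:

```python
# FakeConf is a stand-in for SparkConf, used only to illustrate chaining.
class FakeConf:
    def __init__(self):
        self.opts = {}

    def set(self, key, value):
        self.opts[key] = value
        return self  # chaining works because each call returns the object

# Style 1: explicit continuation with a backslash at the end of each line.
conf1 = FakeConf()
conf1.set("spark.executor.memory", "10g") \
     .set("spark.executor.cores", 5)

# Style 2: implicit continuation inside parentheses -- no backslashes needed.
conf2 = (FakeConf()
         .set("spark.executor.memory", "10g")
         .set("spark.executor.cores", 5))

print(conf1.opts == conf2.opts)  # both styles build the same configuration
```

Either style is valid Python; the red highlighting in the question is an editor/IDE issue, not a syntax error.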

Upvotes: 0

gold_cy

Reputation: 14226

You can use backslashes or parentheses:

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .config("spark.some.config.option", "some-value") \
    .getOrCreate()

Edit: and an example from a spark-submit job:

./bin/spark-submit \
--master <yarn> \
--deploy-mode <cluster> \
--num-executors <2> \
--executor-cores <2>

Upvotes: 11
