Reputation: 734
I need to produce a delimited file where each row is separated by '^' and columns are delimited by '|'.
There doesn't seem to be an option to change the row delimiter for the csv output type.
For example:
# no option found for setting the row delimiter (lineSep) to '^'
df.coalesce(1).write\
.format("com.databricks.spark.csv")\
.mode("overwrite")\
.option("header", "true")\
.option("sep","|")\
.save(destination_path)
Upvotes: 1
Views: 5548
Reputation: 734
In PySpark 3.0 and later, the CSV writer accepts a lineSep option that sets the line separator (a single character at most):
df.coalesce(1).write\
.format("com.databricks.spark.csv")\
.mode("overwrite")\
.option("header", "true")\
.option("sep","|")\
.option("lineSep","^")\
.save(destination_path)
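For checking what the target layout looks like, or as a fallback on older Spark versions when the data is small enough to collect to the driver, the same format can be produced with Python's standard csv module. This is only a sketch: to_delimited is a hypothetical helper, and the sample header and rows stand in for df.columns and df.collect().

```python
import csv
import io


def to_delimited(header, rows, col_sep="|", row_sep="^"):
    """Serialize a header and rows using custom column and row delimiters."""
    buf = io.StringIO()
    # delimiter separates columns; lineterminator is written after every row
    writer = csv.writer(buf, delimiter=col_sep, lineterminator=row_sep)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample data standing in for df.columns and df.collect()
sample = to_delimited(["id", "name"], [(1, "a"), (2, "b")])
print(sample)
```

The csv module writes the terminator after every row, including the last one, and quotes any field that contains the column delimiter.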
Upvotes: 0
Reputation: 26
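One solution is to convert the DataFrame to an RDD and build each line by hand (Scala):
df.rdd.map(x => x.mkString("^")).saveAsTextFile("OutCSV")
Note that mkString("^") joins the columns of each row with '^', and saveAsTextFile still ends every record with a newline.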
Upvotes: 1