Reputation: 764
Is there any way to use a custom record delimiter when reading a CSV file in PySpark? In my file, records are separated by ** instead of newlines. Is there a way to use this custom line/record separator when reading the CSV into a PySpark DataFrame? Also, my column separator is ';'. The code below picks up the columns correctly, but it counts everything as a single row:
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('temp').getOrCreate()
df = (spark.read.format('csv').option("header", "false")
      .option("delimiter", ';').option("inferSchema", "true")
      .load("some-file-on-s3"))
Upvotes: 0
Views: 1342
Reputation: 1228
I would read it as a plain text file into an RDD, split on the characters that form your record delimiter, and then convert the result to a DataFrame, like this:
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

rdd1 = (sc
        .textFile("/jupyter/nfs/test.txt")
        .flatMap(lambda line: line.split("**"))  # one element per "**"-delimited record
        .map(lambda x: x.split(";")))            # split each record into columns on ";"
df1 = rdd1.toDF(["a", "b", "c"])
df1.show()
+---+---+---+
|  a|  b|  c|
+---+---+---+
| a1| b1| c1|
| a2| b2| c2|
| a3| b2| c3|
+---+---+---+
Or, if you prefer to split the columns with DataFrame functions:
from pyspark.sql import functions as f

rdd2 = (sc
        .textFile("/jupyter/nfs/test.txt")
        .flatMap(lambda line: line.split("**"))  # one element per record
        .map(lambda x: [x]))                     # wrap in a list so toDF sees one column
df2 = (rdd2
       .toDF(["abc"])
       .withColumn("a", f.split(f.col("abc"), ";")[0])
       .withColumn("b", f.split(f.col("abc"), ";")[1])
       .withColumn("c", f.split(f.col("abc"), ";")[2])
       .drop("abc"))
df2.show()
+---+---+---+
|  a|  b|  c|
+---+---+---+
| a1| b1| c1|
| a2| b2| c2|
| a3| b2| c3|
+---+---+---+
where test.txt looks like this:
a1;b1;c1**a2;b2;c2**a3;b2;c3
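Note that textFile still splits the input on newlines first, so this works as long as no record contains an embedded newline. If that can happen, you can instead tell the Hadoop input format itself to split records on **. A minimal sketch of that approach, assuming the same test.txt and the sc from above:

# Sketch: have the Hadoop input format split records on "**" instead of newlines
rdd3 = (sc.newAPIHadoopFile(
            "/jupyter/nfs/test.txt",
            "org.apache.hadoop.mapreduce.lib.input.TextInputFormat",
            "org.apache.hadoop.io.LongWritable",
            "org.apache.hadoop.io.Text",
            conf={"textinputformat.record.delimiter": "**"})
        .map(lambda kv: kv[1])           # keep the record text, drop the byte-offset key
        .map(lambda x: x.split(";")))    # split each record into columns on ";"
df3 = rdd3.toDF(["a", "b", "c"])

This keeps each **-delimited record intact even when the file spans multiple physical lines.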
Upvotes: 1