PySpark not able to read CSV file with special character (ø) as delimiter

Question

My CSV file looks like this:

idøageøname
1ø25øAshutosh
2ø21øShipra
3ø11øNimisha
4ø15øBhavya
5ø7øSammridha

I am not able to read this CSV file (the delimiter is ø). The PySpark command below reads each complete line as one column instead of three.

df = spark.read.option("header", "true").option("sep", "ø").csv('file_path.csv')

tifi90 · Accepted Answer

I created the same CSV on my machine and could read the data by setting the encoding to "ISO-8859-1":

df = spark.read.option("header", "true").option("encoding", "ISO-8859-1").option("sep", "ø").csv('file_path.csv')
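
As a rough illustration of why the encoding option can matter (a plain-Python sketch, assuming the file was saved as ISO-8859-1, which the question does not confirm): ø is one byte in ISO-8859-1 but two bytes in UTF-8, and UTF-8 is the default encoding of Spark's CSV reader. Decoding ISO-8859-1 bytes as UTF-8 therefore never yields a ø to split on. The sample row is taken from the question; the variable names are just for this example.

```python
sep = "ø"

# The same character has different byte representations per encoding.
latin1_bytes = sep.encode("iso-8859-1")  # single byte 0xF8
utf8_bytes = sep.encode("utf-8")         # two bytes 0xC3 0xB8
print(latin1_bytes, utf8_bytes)

# One data row from the question, stored as ISO-8859-1 bytes (assumption).
line = "1ø25øAshutosh".encode("iso-8859-1")

# Decoding with the wrong encoding replaces 0xF8 with U+FFFD,
# so splitting on "ø" finds nothing to split on.
decoded_wrong = line.decode("utf-8", errors="replace")
print(decoded_wrong.split(sep))

# Decoding with the matching encoding recovers the three fields.
decoded_right = line.decode("iso-8859-1")
print(decoded_right.split(sep))
```

If the file is in fact UTF-8, this explanation would not apply and the cause lies elsewhere.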

For more information on the encoding, see https://en.wikipedia.org/wiki/ISO/IEC_8859-1 and its code page layout, where ø is the single byte 0xF8.
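
If you are unsure which encoding a file uses, a quick check outside Spark can help. The helper below is a hypothetical sketch (the name guess_encoding and the candidate list are my own, not part of PySpark): it tries strict UTF-8 first and falls back to ISO-8859-1. Note that ISO-8859-1 accepts any byte sequence, so it only tells you the file is not valid UTF-8, not that it is definitely ISO-8859-1.

```python
import os
import tempfile


def guess_encoding(path, candidates=("utf-8", "iso-8859-1")):
    """Return the first candidate encoding that decodes the whole file, else None."""
    with open(path, "rb") as f:
        raw = f.read()
    for enc in candidates:
        try:
            raw.decode(enc)  # strict decoding raises on invalid bytes
            return enc
        except UnicodeDecodeError:
            continue
    return None


# Demo on a throwaway file written as ISO-8859-1 (sample rows from the question).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write("idøageøname\n1ø25øAshutosh\n".encode("iso-8859-1"))
    demo_path = tmp.name

print(guess_encoding(demo_path))
os.remove(demo_path)
```

The result can then be passed to the "encoding" option shown above.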
