Ashutosh gupta

Reputation: 447

PySpark not able to read CSV file with special character (ø) as delimiter

My CSV file looks like this:

idøageøname
1ø25øAshutosh
2ø21øShipra
3ø11øNimisha
4ø15øBhavya
5ø7øSammridha

I am not able to read this CSV file (the delimiter is ø). The PySpark command below reads each complete line as one column instead of three.

df = spark.read.option("header", "true").option("sep", "ø").csv('file_path.csv')

Upvotes: 2

Views: 6363

Answers (1)

tifi90

Reputation: 413

I created the same CSV on my machine and could read the data by setting the encoding to "ISO-8859-1".

df = spark.read.option("header", "true").option("encoding", "ISO-8859-1").option("sep", "ø").csv('file_path.csv')

For more information on the encoding, see https://en.wikipedia.org/wiki/ISO/IEC_8859-1 and its code page layout.
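
A likely explanation, assuming the file was saved as ISO-8859-1 rather than UTF-8: Spark's CSV reader decodes as UTF-8 by default, and ø is stored as a different byte sequence in each encoding. The plain-Python sketch below (no Spark needed) only illustrates that mismatch. The sample line is taken from the question, and the variable names are mine.

```python
# The delimiter from the question, written as an escape so the
# example does not depend on how this snippet itself is saved.
delimiter = "\u00f8"  # ø

utf8_bytes = delimiter.encode("utf-8")          # two bytes in UTF-8
latin1_bytes = delimiter.encode("iso-8859-1")   # one byte in ISO-8859-1

print(utf8_bytes)    # b'\xc3\xb8'
print(latin1_bytes)  # b'\xf8'

# One data line from the question, as it would be stored on disk
# if the file were saved as ISO-8859-1 (an assumption).
raw_line = ("1" + delimiter + "25" + delimiter + "Ashutosh").encode("iso-8859-1")

# Decoding with the wrong codec: the lone 0xF8 byte is not valid UTF-8,
# so it is replaced with U+FFFD and no longer matches the delimiter.
wrong = raw_line.decode("utf-8", errors="replace")
# Decoding with the matching codec recovers ø.
right = raw_line.decode("iso-8859-1")

print(len(wrong.split(delimiter)))  # 1
print(right.split(delimiter))       # ['1', '25', 'Ashutosh']
```

If your file is actually UTF-8 and still fails to split, the cause is something else, so it is worth checking the raw bytes of the file first.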

Upvotes: 3
