Reputation: 447
My CSV file looks like this:
idøageøname
1ø25øAshutosh
2ø21øShipra
3ø11øNimisha
4ø15øBhavya
5ø7øSammridha
I am not able to read this CSV file (the delimiter is ø). The PySpark command below reads each complete line as one column instead of three.
df = spark.read.option("header", "true").option("sep", "ø").csv('file_path.csv')
Upvotes: 2
Views: 6363
Reputation: 413
I created the same CSV on my machine and could read the data by setting the encoding to "ISO-8859-1":
df = spark.read.option("header", "true").option("encoding", "ISO-8859-1").option("sep", "ø").csv('file_path.csv')
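For more information on the encoding, see https://en.wikipedia.org/wiki/ISO/IEC_8859-1 and its code page layout. There, ø is the single byte 0xF8, whereas UTF-8 (Spark's default CSV encoding) represents it as two bytes, which is most likely why the separator was not matched.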
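To see why the encoding option probably matters here, the following is a minimal plain-Python sketch (no Spark needed). It assumes the file was saved as ISO-8859-1, which the thread does not confirm, and the sample line is taken from the question. It only illustrates the byte-level mismatch; it is not a description of how Spark parses the file internally.

```python
# The separator character from the question.
sep = "ø"

# The same character has different byte representations per encoding.
latin1_bytes = sep.encode("iso-8859-1")  # one byte
utf8_bytes = sep.encode("utf-8")         # two bytes

# Assumption: the CSV on disk is ISO-8859-1, so one data line looks like this.
raw_line = "1ø25øAshutosh".encode("iso-8859-1")

# Decoding as UTF-8 cannot recover 'ø' (0xF8 is not valid UTF-8),
# so splitting on the separator finds nothing to split on.
wrong = raw_line.decode("utf-8", errors="replace").split(sep)

# Decoding with the matching encoding restores the separator.
right = raw_line.decode("iso-8859-1").split(sep)

print(latin1_bytes, utf8_bytes)
print(len(wrong), "field(s) when decoded as UTF-8")
print(right)
```

If the file were actually stored as UTF-8, the original command without the encoding option would be expected to split correctly, so checking the file's real encoding first is worthwhile.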
Upvotes: 3