Reputation: 944
I'm trying to pass a value to my Spark program that will be used as the delimiter when reading a .dat file. My code looks something like this:
val delim = args(0)
val df = spark.read.format("csv").option("delimiter", delim).load("/path/to/file/")
I run the program with the following command:
spark2-submit --class a.b.c.MyClass My.jar \\u0001
But I get an error saying that multiple characters can't be used as a delimiter. However, when I use the string literal directly instead of passing it in as a variable, the code works fine:
val df = spark.read.format("csv").option("delimiter", "\u0001").load("/path/to/file/")
Can someone help me with this?
Upvotes: 2
Views: 571
Reputation: 10703
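The string "\u0001" in Scala source is a single Unicode character, but what is passed to Spark from the command line is the literal string "\\u0001" (a backslash followed by the characters u0001, six characters in total). You need to explicitly unescape Unicode: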
val df = spark.read.format("csv").option("delimiter", unescapeUnicode(delim)).load("/path/to/file/")
Find the unescapeUnicode function in this answer.
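Since the linked answer is not reproduced here, the following is a minimal sketch of what such a helper might look like. It is not necessarily the linked implementation: the object name DelimiterUtil and the regex-based approach are assumptions made for illustration. It replaces every backslash-u escape followed by four hex digits with the character it denotes and leaves all other text untouched.

```scala
import java.util.regex.Matcher
import scala.util.matching.Regex

// Hypothetical helper object; the name is an assumption, not part of Spark.
object DelimiterUtil {
  // Matches a backslash, the letter 'u', and exactly four hex digits.
  private val UnicodeEscape: Regex = """\\u([0-9a-fA-F]{4})""".r

  // Replaces each matched escape sequence with the character it encodes.
  def unescapeUnicode(s: String): String =
    UnicodeEscape.replaceAllIn(s, m => {
      val ch = Integer.parseInt(m.group(1), 16).toChar
      // Quote the replacement so that '$' and '\' are not treated specially.
      Matcher.quoteReplacement(ch.toString)
    })
}
```

With this sketch, the call in the snippet above would become DelimiterUtil.unescapeUnicode(delim), which should yield a one-character string for the argument passed in the question.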
Upvotes: 3