Reputation: 515
I have my Apache Flink program:
import org.apache.flink.api.scala._
import scala.util.parsing.json._
object numHits extends App {
val env = ExecutionEnvironment.getExecutionEnvironment
val data=env.readTextFile("file:///path/to/json/file")
val j=data.map { x => ("\"\"\""+x+"\"\"\"") }
/*1*/ println( ((j.first(1).collect())(0)).getClass() )
/*2*/ println( ((j.first(1).collect())(0)) )
/*3*/ println( JSON.parseFull((j.first(1).collect())(0)) )
}
I want to parse the input JSON file into a normal Scala Map, and for that I am using the default scala.util.parsing.json._
library.
The output of the first println
statement is class java.lang.String,
which is the type required by the JSON parsing function.
The output of the second println
is the actual JSON string, prepended and appended with "\"\"\"",
which is also required by the JSON parser.
Now, at this point, if I copy the output of the second println
printed in the console and pass it to the JSON.parseFull()
function, it parses it properly.
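To show what I mean, this is roughly what that manual test looks like (the record below is made up just for illustration):
import scala.util.parsing.json._
// I paste the printed line, including the surrounding """ characters,
// into a Scala REPL and call parseFull on it:
val pasted = JSON.parseFull("""{"user": "bob", "hits": 3}""")
println(pasted) // prints Some(Map(user -> bob, hits -> 3.0))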
Therefore the third println
should parse the same string passed to it, but it does not: it outputs None, which means the parse failed.
Why does this happen and how can I make it work?
Upvotes: 0
Views: 657
Reputation: 515
You just have to change
val j=data.map { x => ("\"\"\""+x+"\"\"\"") }
to
val j=data.map { x => x.replaceAll("\"", "\\\"") }
But the above change is not even required, as the code below will work:
val data=env.readTextFile("file:///path/to/json").flatMap( line => JSON.parseFull(line) )
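If it helps, here is a minimal self-contained sketch of the whole program built around that one-liner, assuming one JSON object per line (the object name and the final print are only illustrative):
import org.apache.flink.api.scala._
import scala.util.parsing.json._

object NumHits extends App {
  val env = ExecutionEnvironment.getExecutionEnvironment

  // parseFull returns an Option, so flatMap keeps the lines that parse
  // into a value and drops the ones that fail
  val parsed = env
    .readTextFile("file:///path/to/json")
    .flatMap(line => JSON.parseFull(line))

  // e.g. look at the first parsed record
  parsed.first(1).print()
}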
Upvotes: 0
Reputation: 170713
The output of the second println is the actual JSON string, prepended and appended with "\"\"\"", which is also required by the JSON parser.
No, of course it isn't. This produces a string like """{}""", which isn't valid JSON and is properly rejected by the parser. When you write """{}""" in Scala code, the quotes aren't part of the string itself; they just delimit the literal: the content of the string is {}, which is valid JSON.
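You can see the difference in a couple of lines (a small REPL-style sketch):
import scala.util.parsing.json._

// The literal """{}""" is just the two characters {}
val s = """{}"""
println(s.length)          // 2
println(JSON.parseFull(s)) // Some(Map()) -- valid JSON

// Building a string whose content really starts and ends with """
// reproduces what the map function does, and it fails to parse
val wrapped = "\"\"\"" + s + "\"\"\""
println(wrapped)                 // """{}"""
println(JSON.parseFull(wrapped)) // None -- not valid JSON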
Upvotes: 2