Reputation: 969
I would like to read a JSON file as JSON without parsing it. I do not want to use a DataFrame; I would only like to read it as a regular file with the format still intact. Any ideas? I tried reading it with wholeTextFiles, but that creates a df.
Upvotes: 4
Views: 9276
Reputation: 19308
The upickle library is the easiest "pure Scala" way to read a JSON file:
val jsonString = os.read(os.pwd/"src"/"test"/"resources"/"phil.json")
val data = ujson.read(jsonString)
data.value // LinkedHashMap("first_name" -> Str("Phil"), "last_name" -> Str("Hellmuth"), "birth_year" -> Num(1964.0))
See this post for more details.
The code snippet above uses os-lib to read the file from disk. If you're running the code in a cluster environment, you'll probably want to use a different library, depending on where the file is located and on your environment.
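If the file lives on distributed storage, one hedged option is to read the raw text through Spark's `wholeTextFiles` (which returns path/content pairs, not a DataFrame) and then hand the string to ujson. This is a sketch; `sc` is assumed to be an existing `SparkContext`, and the HDFS path is hypothetical:

```scala
// Read the whole file as a single string from distributed storage,
// then parse it with ujson. The path below is a placeholder.
val raw: String = sc
  .wholeTextFiles("hdfs:///data/phil.json") // RDD[(path, content)]
  .values                                   // keep only the file contents
  .first()                                  // the full file as one string

val data = ujson.read(raw)
```

This keeps the file intact as text until you explicitly parse it, which matches the question's intent.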
Avoid the other Scala JSON libraries because they're harder to use.
Upvotes: 1
Reputation: 2481
Since you didn't accept the Spark-specific answer, maybe you could try a plain Scala solution like this one, using the spray-json library:
import spray.json._
val source = scala.io.Source.fromFile("yourFile.txt")
val lines = try source.mkString finally source.close()
val yourJson = lines.parseJson
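Once parsed, the resulting `JsValue` can be inspected without any case-class mapping. A minimal sketch, using a hypothetical inline JSON string in place of the file contents:

```scala
import spray.json._

// Placeholder for the string read from yourFile.txt
val lines = """{"first_name": "Phil", "birth_year": 1964}"""
val yourJson = lines.parseJson

// Access top-level fields as a Map[String, JsValue]
val fields = yourJson.asJsObject.fields
val name = fields("first_name") // a JsString
```

`asJsObject` throws if the root value is not a JSON object, so wrap it accordingly if your input might be an array.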
Upvotes: 3
Reputation: 1211
I've noticed you specified the apache-spark tag; if you meant vanilla Scala, this answer will not be applicable. With the code below you get an RDD[String], which is the most text-like distributed data structure available.
// where sc is your Spark context
val textFile = sc.textFile("myFile.json")
// textFile: org.apache.spark.rdd.RDD[String]
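If you need the file back as one local string (for example to feed it to a JSON parser), you can collect the lines on the driver. A hedged sketch; this is only safe when the file fits in driver memory:

```scala
// Reassemble the distributed lines into a single local string,
// preserving the original line breaks.
val jsonString: String = textFile.collect().mkString("\n")
```

Note that `collect()` pulls all partitions to the driver, which defeats the purpose of the RDD for large files.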
Upvotes: 0