Reputation: 664
I currently have a JSON file, say student.json. Its structure looks something like this:
{"serialNo":"1","name":"Rahul"}
{"serialNo":"2","name":"Rakshith"}
case class Student(serialNo:Int,name:String)
student.json is a huge file that I am planning to parse in a Spark job. Here is the snippet:
import play.api.libs.json.{ Json, JsObject, JsString }
.....
.....
// assumes an implicit Reads[Student] is in scope (e.g. defined in the elided lines above)
for {
  jsonLine <- sc.textFile("student.json")
  student  <- Json.parse(jsonLine).asOpt[Student]
} yield (student.serialNo -> student.name)
Is there a better way to do this?
Upvotes: 0
Views: 638
Reputation: 11200
If student.json is a huge file and each line is a valid JSON object, you should do:
val myRdd = sc.textFile("student.json").map(l => Json.parse(l).asOpt[Student]) // RDD[Option[Student]]
If you want to bring the RDD's contents back to the driver, you can:
val students = myRdd.collect() // then you can work with the result as an ordinary local Array, provided it fits in driver memory
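To make the steps above concrete, here is a minimal end-to-end sketch, not a definitive implementation. A few things in it are my assumptions rather than anything from the question: the object name ParseStudents, the helper parseLine, the local[*] master, and the choice to declare serialNo as a String. The sample lines quote the number ("1"), and as far as I know the macro-generated Reads would reject a quoted value for an Int field, so asOpt would return None for every line. The sketch also uses flatMap instead of map so that lines that fail to parse are dropped rather than kept as None.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import play.api.libs.json.{Json, Reads}
import scala.util.Try

// Assumption: serialNo is a String here because the sample JSON quotes it.
case class Student(serialNo: String, name: String)

object ParseStudents {
  // Macro-generated reader for the case class.
  implicit val studentReads: Reads[Student] = Json.reads[Student]

  // Hypothetical helper: returns None for malformed JSON (Json.parse throws)
  // as well as for valid JSON that does not match Student.
  def parseLine(line: String): Option[Student] =
    Try(Json.parse(line)).toOption.flatMap(_.asOpt[Student])

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, just for trying the sketch out.
    val conf = new SparkConf().setAppName("ParseStudents").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // flatMap keeps only the lines that parsed successfully.
    val students = sc.textFile("student.json").flatMap(line => parseLine(line).toList)
    val pairs = students.map(s => s.serialNo -> s.name)

    // take(n) pulls a small sample to the driver; collect() would pull everything.
    pairs.take(5).foreach(println)
    sc.stop()
  }
}
```

Since the file is huge, take(n) or writing the result back out with saveAsTextFile is probably safer than collect(), which materializes every record on the driver.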
I see you are importing play.api.libs.json, which is from the Play Framework. I don't think running a Spark program inside a web application is a good idea...
Upvotes: 1