Reputation: 41
I am using Spark SQL to extract some information from a JSON file. I want to save the result of the SQL analysis into another JSON file so that I can plot it with Tableau or d3.js, but I don't know exactly how to do it. Any suggestions?
val inputTable = sqlContext.jsonFile(inputDirectory).cache()
inputTable.registerTempTable("inputTable")
val languages = sqlContext.sql("""
SELECT
user.lang,
COUNT(*) as cnt
FROM inputTable
GROUP BY user.lang
ORDER BY cnt DESC
LIMIT 15""")
languages.rdd.saveAsTextFile(outputDirectory + "/lang")
languages.collect.foreach(println)
I don't mind saving my data to a .csv file instead, but I don't know exactly how to do it.
Thanks!
Upvotes: 3
Views: 9406
Reputation: 6059
It is just
val languagesDF: DataFrame = sqlContext.sql("<YOUR_QUERY>")
languagesDF.write.json("your.json")
You do not need to go back to an RDD.
Be aware, though, that the JSON output will be split into multiple part files. If that is not what you want, read up on how to circumvent this (if really required). The main point is to use repartition or coalesce.
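To make the coalesce suggestion concrete, here is a minimal sketch, assuming Spark 1.4 or later (where `DataFrame.write` is available) and a job launched through spark-submit. The object name `SaveLanguagesAsJson` and the two command-line paths are placeholders of mine, not something from the question, and the query reads from the `inputTable` name that the question registers.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical driver object; the name and argument layout are assumptions.
object SaveLanguagesAsJson {
  def main(args: Array[String]): Unit = {
    val inputDirectory = args(0)  // assumed: directory holding the input JSON
    val outputDirectory = args(1) // assumed: base directory for the results

    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("SaveLanguagesAsJson"))
    val sqlContext = new SQLContext(sc)

    // Load the JSON input and expose it to SQL under the name used in the query.
    val inputTable = sqlContext.read.json(inputDirectory)
    inputTable.registerTempTable("inputTable")

    val languagesDF: DataFrame = sqlContext.sql(
      """SELECT user.lang, COUNT(*) AS cnt
        |FROM inputTable
        |GROUP BY user.lang
        |ORDER BY cnt DESC
        |LIMIT 15""".stripMargin)

    // coalesce(1) reduces the result to a single partition,
    // so the writer emits a single part file.
    languagesDF.coalesce(1).write.json(outputDirectory + "/lang")

    sc.stop()
  }
}
```

Note that the path given to `write.json` ends up as a directory: the data sits in a `part-*` file inside it, with one JSON object per line. Pulling everything into one partition is fine for a 15-row result like this, but it can become a bottleneck for large outputs. If you would rather have CSV, Spark 2.0 and later offer `write.csv` on the same writer; on 1.x you would need an external package such as spark-csv.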
Upvotes: 4