Reputation: 495
I have created a Spark DataFrame that looks like this:
+----+-------+
| age| number|
+----+-------+
| 16| 12|
| 16| 13|
| 16| 14|
| 17| 15|
| 17| 16|
| 17| 17|
+----+-------+
I want to convert it to the following JSON format:
[{
'age' : 16,
'name' : [12,13,14]
},{
'age' : 17,
'name' : [15,16,17]
}]
How can I achieve this?
Upvotes: 0
Views: 91
Reputation: 1076
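You can try the to_json function. Something like this:
import org.apache.spark.sql.functions.{collect_list, struct, to_json}
import spark.implicits._
val list = List((16,12), (16,13), (16,14), (17,15), (17,16), (17,17))
// parallelize lives on the SparkContext, not on the SparkSession
val df = spark.sparkContext.parallelize(list).toDF("age", "number")
// Collect the numbers for each age, then serialize each (age, name) row to a JSON string
val jsondf = df.groupBy($"age").agg(collect_list($"number").as("name"))
  .withColumn("json", to_json(struct($"age", $"name")))
  .drop("age", "name")
  // Gather the per-age JSON strings into a single array column
  .agg(collect_list($"json").as("json"))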
The results are below. I hope it helps.
+------------------------------------------------------------+
|json |
+------------------------------------------------------------+
|[{"age":16,"name":[12,13,14]}, {"age":17,"name":[15,16,17]}]|
+------------------------------------------------------------+
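Note that the column above is an array of JSON strings rather than one JSON document. If you need a single JSON array string on the driver, one possible variation is to use Dataset.toJSON and join the rows yourself. This is only a rough sketch: the local SparkSession setup and the names grouped and json are my own assumptions, and collect() pulls every row to the driver, so it only suits small results. The order of elements inside each collected list is not guaranteed by Spark.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.collect_list

// Assumption: a local session for illustration; in spark-shell, `spark` already exists
val spark = SparkSession.builder().master("local[*]").appName("age-json").getOrCreate()
import spark.implicits._

val df = List((16,12), (16,13), (16,14), (17,15), (17,16), (17,17)).toDF("age", "number")

// One row per age, with the numbers gathered into an array column called "name"
val grouped = df.groupBy($"age")
  .agg(collect_list($"number").as("name"))
  .orderBy($"age")

// toJSON turns each row into a JSON object string; join them into one JSON array
val json = grouped.toJSON.collect().mkString("[", ",", "]")
println(json)
```

From there you can write json to a file or pass it to whatever expects the array.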
Upvotes: 2