Learnis

Reputation: 566

DataFrame to Array of Json(keys,values)

I have a DataFrame as shown below:

+-------------+-------------+-------------+
| columnName1 | columnName2 | columnName3 |
+-------------+-------------+-------------+
| 001         | 002         | 003         |
+-------------+-------------+-------------+
| 004         | 005         | 006         |
+-------------+-------------+-------------+

I want to convert it to JSON in the format below.

EXPECTED FORMAT

[[{"key":"columnName1","value":"001"},{"key":"columnName2","value":"002"},{"key":"columnName3","value":"003"}],[{"key":"columnName1","value":"004"},{"key":"columnName2","value":"005"},{"key":"columnName3","value":"006"}]]

Thanks in advance.

I can call DF.toJSON.collect(). This gives [{"columnName1":"001","columnName2":"002","columnName3":"003"},{"columnName1":"004","columnName2":"005","columnName3":"006"}]

But I need the output in the expected format shown above.

Upvotes: 1

Views: 220

Answers (2)

Vihit Shah

Reputation: 314

The answer already posted seems perfectly fine. I had to make a small tweak to get it working: the per-row arrays are joined into a single outer array. Please find it below!

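import org.apache.spark.sql.functions.{array, col, concat, lit}

// Build one {"key": "<column>", "value": "<cell>"} string column per DataFrame column
val json = df.columns.map(c => concat(
  lit("{\"key\": \""),
  lit(c + "\","),
  lit("\"value\": \""),
  concat(col(c), lit("\"}")))
)

// Wrap each row's objects in [...], then wrap all rows in an outer [...]
val answer = df.select(array(json: _*))
  .collect()
  .map(_.getAs[Seq[String]](0).mkString("[", ", ", "]"))
  .mkString("[", ", ", "]")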

Check if you find this helpful!

Upvotes: 0

koiralo

Reputation: 23099

You can manually build a JSON string from the given columns and collect the result, one string per row, as below:

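import org.apache.spark.sql.functions.{array, col, concat, lit}

// Build one {"key": "<column>", "value": "<cell>"} string column per DataFrame column
val json = df.columns.map(c => concat(
  lit("{\"key\": \""),
  lit(c + "\","),
  lit("\"value\": \""),
  concat(col(c), lit("\"}")))
)

// Collect to the driver and render each row as a JSON array string
df.select(array(json: _*))
  .collect()
  .map(_.getAs[Seq[String]](0).mkString("[", ", ", "]"))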
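To see the shape of the string this approach produces without starting Spark, here is a minimal plain-Scala sketch of the same idea applied to in-memory data. The object name `KeyValueJson`, the helper names, and the sample data are hypothetical and only mirror the question's table; no escaping is done, so it assumes column names and values contain no quotes or backslashes.

```scala
object KeyValueJson {
  // Hypothetical stand-in for the DataFrame: column names plus rows of string values.
  val columns: Seq[String] = Seq("columnName1", "columnName2", "columnName3")
  val rows: Seq[Seq[String]] = Seq(Seq("001", "002", "003"), Seq("004", "005", "006"))

  // Render a single {"key": ..., "value": ...} object.
  // Assumption: key and value need no JSON escaping.
  def keyValue(key: String, value: String): String =
    s"""{"key":"$key","value":"$value"}"""

  // One JSON array per row, then one outer array around all rows.
  def toKeyValueJson(columns: Seq[String], rows: Seq[Seq[String]]): String =
    rows
      .map(row => columns.zip(row).map { case (k, v) => keyValue(k, v) }.mkString("[", ",", "]"))
      .mkString("[", ",", "]")

  val result: String = toKeyValueJson(columns, rows)
}
```

The inner `mkString` corresponds to the per-row step in the Spark code above, and the outer `mkString` is the extra step needed if a single array-of-arrays string is wanted. If values may contain quotes, a proper JSON encoder would be a safer choice than string concatenation.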

Upvotes: 2
