qubiter

Reputation: 245

Save dataframe as JSON in specific structure in Spark Scala

I have a DataFrame df that can be saved as a JSON file with the following structure: {"id":"1234567890","score":123.0,"date":yyyymmdd}

At the moment I am saving it as follows:

df.write.format("json").save("path")

This df needs to be saved as a file in which each line has the following structure: id::1234567890\t{"id":"1234567890","score":123.0,"date":yyyymmdd}

I tried various approaches but couldn't get it to work. How can I save it in the desired format?

Spark version: 1.6.0
Scala version: 2.10.6

Upvotes: 1

Views: 3351

Answers (1)

Nils

Reputation: 446

That is not JSON, so the JSON writer cannot produce it. You are better off converting to an RDD, building each line in that custom format yourself, and saving the result as text.

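// Field names and types must match the columns of df.
final case class LineOfSomething(id: String, score: BigDecimal, date: String)
import sqlContext.implicits._
df
  .as[LineOfSomething]
  .rdd
  .mapPartitions(lines => {
    // Create one ObjectMapper per partition rather than one per row.
    val mapper = new com.fasterxml.jackson.databind.ObjectMapper()
    mapper.registerModule(com.fasterxml.jackson.module.scala.DefaultScalaModule)
    lines.map(line => {
      val json = mapper.writeValueAsString(line)
      // Prefix the JSON with "id::<id>" followed by a tab.
      s"id::${line.id}\t$json"
    })
  })
  // output is the destination path (a String) and is defined elsewhere.
  .saveAsTextFile(output)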
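
To check the line format without a cluster, the string-building step can be tried on its own. The following is only a minimal sketch: Record, LineFormatDemo, toJson and toLine are hypothetical names, the sample date value is made up, and the JSON is assembled by hand purely to avoid dependencies (it assumes id needs no JSON escaping and that date is numeric, as the unquoted value in the question suggests).

```scala
// Hypothetical stand-in for one row of df; field names follow the question.
// Assumption: date is numeric, since the question shows it unquoted.
final case class Record(id: String, score: Double, date: Int)

object LineFormatDemo {
  // Hand-built JSON to keep the sketch dependency-free.
  // Assumes id contains no characters that need JSON escaping.
  def toJson(r: Record): String =
    s"""{"id":"${r.id}","score":${r.score},"date":${r.date}}"""

  // Prefix the JSON with "id::<id>" and a tab, as the question requires.
  def toLine(r: Record): String = s"id::${r.id}\t${toJson(r)}"

  def main(args: Array[String]): Unit = {
    // Sample values are illustrative only.
    println(toLine(Record("1234567890", 123.0, 20160101)))
  }
}
```

In the Spark job itself, a real serializer such as the Jackson mapper above is the safer choice, because it handles escaping and type conversion.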

Upvotes: 1
