Reputation: 245
I have a DataFrame df that can be saved as a JSON file in the following structure:
{"id":"1234567890","score":123.0,"date":yyyymmdd}
Currently I am saving it as follows:
df.write.format("json").save("path")
Instead, this df needs to be saved in the following structure, with each JSON record prefixed by its id and a tab:
id::1234567890\t{"id":"1234567890","score":123.0,"date":yyyymmdd}
I tried various ways but couldn't do it. How can we save it in the desired format?
Spark version: 1.6.0
Scala version: 2.10.6
Upvotes: 1
Views: 3351
Reputation: 446
That is not JSON format, so the JSON writer cannot produce it. You are better off converting to an RDD, mapping each record to that custom line format, and saving the result as text.
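import sqlContext.implicits._

// Define the case class at top level (not inside a method) so an encoder can be derived.
// Field names and types must match the columns of df.
final case class LineOfSomething(id: String, score: BigDecimal, date: String)

df
  .as[LineOfSomething]
  .rdd
  .mapPartitions { lines =>
    // Build one ObjectMapper per partition rather than one per record.
    val mapper = new com.fasterxml.jackson.databind.ObjectMapper()
    mapper.registerModule(com.fasterxml.jackson.module.scala.DefaultScalaModule)
    lines.map { line =>
      val json = mapper.writeValueAsString(line)
      s"id::${line.id}\t$json" // prefix, tab, then the JSON record
    }
  }
  .saveAsTextFile(output) // output is the destination path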
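To make the line layout explicit, here is a minimal sketch in plain Scala with no Spark or Jackson dependency. The object PrefixedJsonLine and its format and parse methods are hypothetical names for illustration, not part of any library. It assumes the id itself never contains a tab, so the first tab on a line always separates the prefix from the JSON.

```scala
// Hypothetical helper illustrating the "id::<id>\t<json>" line layout.
object PrefixedJsonLine {
  private val Prefix = "id::"

  // Build one output line from an id and an already serialized JSON string.
  def format(id: String, json: String): String = s"$Prefix$id\t$json"

  // Split a line back into (id, json). Returns None when the line has no tab
  // or does not start with the expected prefix.
  def parse(line: String): Option[(String, String)] =
    line.split("\t", 2) match {
      case Array(key, json) if key.startsWith(Prefix) =>
        Some((key.substring(Prefix.length), json))
      case _ => None
    }
}
```

A reader of the saved files could call PrefixedJsonLine.parse on each line and hand the second element to any JSON parser.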
Upvotes: 1