Reputation: 741
How to convert a DataFrame to RDD[String, String]?
I have a DataFrame:
df : [id: String, country: String, title: String]
How do I convert it to RDD[String, String], where the first column is the key and a JSON string built from the remaining columns is the value?
key : id
value : {country: "US", title: "MK"}
Upvotes: 0
Views: 7074
Reputation: 13985
You cannot have an RDD[String, String]. RDD takes only one type parameter, so what you want is RDD[(String, String)], an RDD of (key, value) tuples:
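df.rdd
  .map { row =>
    // Columns are read by position: 0 = id, 1 = country, 2 = title
    val id = row.getString(0)
    val country = row.getString(1)
    val title = row.getString(2)
    // Quote keys and values so the result is valid JSON
    // (values containing quotes or backslashes would still need escaping)
    val jsonString = s"""{"country": "$country", "title": "$title"}"""
    (id, jsonString)
  }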
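Interpolating column values straight into a JSON string breaks as soon as a value contains a double quote, a backslash, or a control character. Below is a minimal sketch of one way to guard against that in plain Scala. The names `JsonPair`, `escapeJson`, and `toPair` are hypothetical helpers introduced for illustration and are not part of Spark. Null column values are not handled here.

```scala
object JsonPair {
  // Hypothetical helper: escape the characters that must not appear
  // unescaped inside a JSON string literal.
  def escapeJson(s: String): String =
    s.flatMap {
      case '"'          => "\\\""
      case '\\'         => "\\\\"
      case '\n'         => "\\n"
      case '\r'         => "\\r"
      case '\t'         => "\\t"
      // Any other control character becomes a backslash-u escape with 4 hex digits
      case c if c < ' ' => "\\" + "u" + "%04x".format(c.toInt)
      case c            => c.toString
    }

  // Hypothetical helper: build the (key, jsonValue) pair for one row.
  def toPair(id: String, country: String, title: String): (String, String) = {
    val json =
      s"""{"country": "${escapeJson(country)}", "title": "${escapeJson(title)}"}"""
    (id, json)
  }
}
```

Inside the `map`, this could be called as `JsonPair.toPair(row.getString(0), row.getString(1), row.getString(2))`. For anything beyond a couple of string fields, a JSON library or Spark's own JSON functions are likely a safer choice than hand-built strings.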
Upvotes: 2
Reputation: 6342
There is DataFrame.toJSON, which returns an RDD[String] (in Spark 2.x and later it returns a Dataset[String]). Based on this method, you can do the transformation yourself.
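One way to apply that idea while keeping the key separate is to serialize only the non-key columns with the column-level functions `to_json` and `struct`. This is a sketch, assuming Spark 2.1 or later and the column names `id`, `country`, and `title`. The object `KeyedJson` and method `toKeyedJson` are hypothetical names used only for this example.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, struct, to_json}

object KeyedJson {
  // Hypothetical helper: key = id column, value = JSON built from the other columns.
  def toKeyedJson(df: DataFrame): RDD[(String, String)] =
    df
      // Pack the non-key columns into a struct and let Spark serialize it,
      // so quoting and escaping are handled by Spark rather than by hand.
      .select(col("id"), to_json(struct(col("country"), col("title"))).as("value"))
      .rdd
      .map(row => (row.getString(0), row.getString(1)))

  def main(args: Array[String]): Unit = {
    // Local session for demonstration only
    val spark = SparkSession.builder().master("local[*]").appName("keyed-json").getOrCreate()
    import spark.implicits._

    val df = Seq(("1", "US", "MK")).toDF("id", "country", "title")
    toKeyedJson(df).collect().foreach(println)

    spark.stop()
  }
}
```

Compared with building the string by hand, this leaves escaping to Spark, at the cost of depending on a newer Spark version than the one the answers above may have targeted.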
Upvotes: 0