Newbie

Reputation: 741

How to convert a dataframe to RDD[String, String]?

I have a DataFrame:

df : [id : String, country : String, title : String]

How do I convert it to an RDD[String, String], where the first column is the key and a JSON string built from the remaining columns is the value?

key : id
value : {country: "US", title: "MK"}

Upvotes: 0

Views: 7074

Answers (2)

sarveshseri

Reputation: 13985

You cannot have an RDD[String, String]. RDD takes only one type parameter, so what you want is an RDD[(String, String)], that is, an RDD of key/value tuples.

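df.rdd
  .map(row => {
    // Columns are read by position: 0 = id, 1 = country, 2 = title
    val id = row.getString(0)
    val country = row.getString(1)
    val title = row.getString(2)

    // Quote keys and values so the result is valid JSON
    // (assumes the values contain no characters that need escaping)
    val jsonString = s"""{"country": "$country", "title": "$title"}"""

    (id, jsonString)
  })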

Upvotes: 2

Tom

Reputation: 6342

There is DataFrame.toJSON, which returns an RDD[String] (in Spark 2.0 and later it returns a Dataset[String], so call .rdd on the result). Based on this method, you can do the transformation yourself.
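
A related way to let Spark build the JSON, rather than toJSON itself, is the to_json and struct column functions. The following is only a sketch, assuming Spark 2.1 or later and that the columns are named id, country and title as in the question. The object name DfToPairRdd and the helper toPairRdd are made up for illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, struct, to_json}

// Hypothetical wrapper object, not part of Spark.
object DfToPairRdd {

  // Key: the `id` column. Value: a JSON string built from `country` and `title`.
  // Assumes Spark 2.1+ (to_json) and the column names from the question.
  def toPairRdd(df: DataFrame): RDD[(String, String)] =
    df.select(
        col("id"),
        to_json(struct(col("country"), col("title"))).as("value")
      )
      .rdd
      .map(row => (row.getString(0), row.getString(1)))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("df-to-pair-rdd")
      .getOrCreate()
    import spark.implicits._

    // Tiny sample frame matching the schema in the question.
    val df = Seq(("1", "US", "MK")).toDF("id", "country", "title")

    toPairRdd(df).collect().foreach(println)

    spark.stop()
  }
}
```

Because Spark serializes the struct itself, quoting and escaping of the values are handled for you, which the hand-built string in the other answer does not do.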

Upvotes: 0
