Reputation: 840
I have a DataFrame that I explicitly convert to an RDD, and I'm trying to read several column values from each record. I can't get all of them out of a single map. Here is what I've tried:
val df = sql("Select col1, col2, col3, col4, col5 from tableName").rdd
The resulting df is of type org.apache.spark.rdd.RDD[org.apache.spark.sql.Row].
Now I'm trying to access each element of this RDD via:
val dfrdd = df.map{x => x.get(0); x.getAs[String](1); x.get(3)}
The problem is that the statement above returns only the value of the last expression inside the map, i.e. the data from x.get(3). Can someone tell me what I'm doing wrong?
Upvotes: 3
Views: 6860
Reputation: 23109
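The function passed to map returns the value of its last expression. In your case only x.get(3) is returned; the results of x.get(0) and x.getAs[String](1) are computed and then discarded.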
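To see this behaviour without Spark, here is a minimal sketch in plain Scala. The object name BlockValueDemo and the sample data are made up for illustration, and a Seq of strings stands in for a Row. The compiler will likely warn that the first two expressions do nothing, which is exactly the point.

```scala
object BlockValueDemo {
  // Hypothetical stand-in for rows: each inner Seq plays the role of one record.
  val rows = Seq(Seq("a0", "a1", "a2", "a3"), Seq("b0", "b1", "b2", "b3"))

  // All three expressions are evaluated in order, but only the last one
  // becomes the result of the function passed to map.
  val lastOnly = rows.map { r => r(0); r(1); r(3) }

  def main(args: Array[String]): Unit =
    println(lastOnly) // List(a3, b3)
}
```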
To return multiple values, return them as a tuple, as below:
val dfrdd = df.map{x => (x.get(0), x.getAs[String](1), x.get(3))}
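As a rough illustration of what the tuple version produces, the sketch below applies the same function to a small local collection of Row objects instead of an RDD. It assumes spark-sql is on the classpath; the object name RowTupleDemo, the sample values, and the column types (Int, String, Double, String, Boolean) are assumptions, not taken from the question's table. It also shows the typed getters getInt and getString, which avoid ending up with Any in the tuple if the column types are known.

```scala
import org.apache.spark.sql.Row

object RowTupleDemo {
  // Hypothetical sample rows standing in for the query result (col1..col5).
  val rows: Seq[Row] = Seq(
    Row(1, "a", 2.5, "x", true),
    Row(2, "b", 3.5, "y", false)
  )

  // Same function as in the answer: get(i) returns Any,
  // getAs[String](i) casts the value to String.
  val tuples: Seq[(Any, String, Any)] =
    rows.map(x => (x.get(0), x.getAs[String](1), x.get(3)))

  // Variant with typed getters, assuming col1 is an Int and col4 a String.
  val typed: Seq[(Int, String, String)] =
    rows.map(x => (x.getInt(0), x.getString(1), x.getString(3)))

  def main(args: Array[String]): Unit = {
    tuples.foreach(println)
    typed.foreach(println)
  }
}
```

The same lambdas can be passed unchanged to map on the RDD[Row] from the question.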
Hope this helped!
Upvotes: 5