I have a DataFrame with columns [CUSTOMER_ID, itemType, eventTimeStamp, valueType]
which I convert to RDD[(String, (String, String, Map[String, Int]))]
by doing the following:
val tempFile = result.rdd.map { r =>
  val customerId = r.getAs[String]("CUSTOMER_ID")
  val itemType = r.getAs[String]("itemType")
  val eventTimeStamp = r.getAs[String]("eventTimeStamp")
  val valueType = r.getAs[Map[String, Int]]("valueType")
  (customerId, (itemType, eventTimeStamp, valueType))
}
Since my input is huge, this takes a long time. Is there a more efficient way to convert the df
to RDD[(String, (String, String, Map[String, Int]))]?
Upvotes: 1
Views: 1727
Reputation: 3725
The operation you've described is about as cheap as it's going to get: doing a few getAs calls and allocating a few tuples per row is almost free. If it's going slow, that's probably unavoidable due to your large data size (7T). Also note that Catalyst optimizations cannot be performed on RDDs, so including this kind of .map downstream of DataFrame operations will often prevent other Spark shortcuts.
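If you can keep as much work as possible in the DataFrame API and only drop to an RDD at the very end, everything upstream of that final step still benefits from Catalyst. Here is a minimal sketch of that approach, assuming Spark 2.x, a SparkSession named spark, and that result has exactly these four columns; the case class name CustomerEvent is made up for illustration:

import org.apache.spark.sql.SparkSession

// Hypothetical case class mirroring the four columns of `result`.
case class CustomerEvent(
  CUSTOMER_ID: String,
  itemType: String,
  eventTimeStamp: String,
  valueType: Map[String, Int]
)

val spark: SparkSession = SparkSession.builder().getOrCreate()
import spark.implicits._

// Convert to a typed Dataset first; column pruning and predicate pushdown
// on `result` can still happen before this point. Only the final .rdd
// leaves the optimized plan.
val pairRdd = result.as[CustomerEvent].rdd
  .map(e => (e.CUSTOMER_ID, (e.itemType, e.eventTimeStamp, e.valueType)))

The per-row cost is the same either way; the win is pushing any filters or projections onto result before the conversion so they still run under Catalyst.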
Upvotes: 2