Dhrumil Shah

Reputation: 33

How to convert an RDD to a DataFrame in Spark (Scala)?

I am new to Spark. I am trying to convert the RDD below to a DataFrame, but without success.

val customerRDD = sc.textFile("file:///home/hduser/data//customer.txt")
// custId,CustName,CustEmail,CustPhone
// 1,ABC,[email protected],+199240242234

I tried calling customerRDD.toDF(), but it does not work.

I also tried the createDataFrame() method, but I could not figure out how to use it.

Can anyone help me convert this RDD to a DataFrame?

Thanks

Upvotes: 0

Views: 385

Answers (1)

Ged

Reputation: 18108

Reading a file through an RDD is an odd way of doing things these days (the DataFrame reader, spark.read, is the usual route), but if you must use an RDD to read a file with a header, then consult https://sparkbyexamples.com/apache-spark-rdd/spark-load-csv-file-into-rdd/ and note specifically:

  • Skip the header line of each file (shown in the linked article)
  • Extract the columns yourself via map (also shown there)
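
The two steps above might look roughly like the following. This is only a minimal sketch: it assumes the first line of customer.txt is the header custId,CustName,CustEmail,CustPhone and that no field contains an embedded comma. The object name CustomerRddParse and the helper parseLine are made up for illustration and do not come from the linked article.

```scala
import org.apache.spark.sql.SparkSession

object CustomerRddParse {
  // Hypothetical helper: split one CSV line into trimmed fields.
  // The -1 limit keeps trailing empty fields instead of dropping them.
  def parseLine(line: String): Array[String] =
    line.split(",", -1).map(_.trim)

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption for trying this out on one machine.
    val spark = SparkSession.builder()
      .appName("customer-rdd-parse")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    val lines = sc.textFile("file:///home/hduser/data//customer.txt")

    // Assumed: the first line is the header. Filtering on equality drops
    // every copy of it, which also covers several files with the same header.
    val header = lines.first()
    val fields = lines
      .filter(_ != header)
      .map(parseLine)
      .filter(_.length == 4) // silently skips malformed lines

    fields.take(5).foreach(f => println(f.mkString(" | ")))
    spark.stop()
  }
}
```

The result is an RDD[Array[String]], one array of four strings per customer, which is the shape needed before giving the data column names and types.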

For creating a DF from an RDD with an explicit schema built from Structs, see https://sparkbyexamples.com/apache-spark-rdd/convert-spark-rdd-to-dataframe-dataset. You can

  • create a schema programmatically for a DF from RDD via createDataFrame()
  • or use the default schema by importing the implicits and calling toDF()
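
Both options could be sketched as below. This is an illustration under assumptions, not the linked article's code: it assumes the file has a header line, four comma-separated fields per row, and that all columns can stay strings. The names CustomerRddToDf and customerSchema are hypothetical. Note that with the implicits in scope, calling toDF() directly on the RDD[String] returned by textFile gives a single string column named value, which may be why it appeared not to work in the question.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object CustomerRddToDf {
  // Hypothetical schema: every column kept as a nullable string.
  val customerSchema: StructType = StructType(Seq(
    StructField("custId", StringType, nullable = true),
    StructField("CustName", StringType, nullable = true),
    StructField("CustEmail", StringType, nullable = true),
    StructField("CustPhone", StringType, nullable = true)
  ))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("customer-rdd-to-df")
      .master("local[*]") // assumption: local run
      .getOrCreate()
    import spark.implicits._ // brings toDF() into scope

    val lines = spark.sparkContext.textFile("file:///home/hduser/data//customer.txt")
    val header = lines.first() // assumed to be the header line
    val fields = lines
      .filter(_ != header)
      .map(_.split(",", -1))
      .filter(_.length == 4)

    // Option 1: default schema via implicits; tuples become columns,
    // and toDF is given the column names.
    val viaImplicits = fields
      .map(f => (f(0), f(1), f(2), f(3)))
      .toDF("custId", "CustName", "CustEmail", "CustPhone")

    // Option 2: programmatic schema via createDataFrame on an RDD[Row].
    val rowRdd = fields.map(f => Row(f(0), f(1), f(2), f(3)))
    val viaSchema = spark.createDataFrame(rowRdd, customerSchema)

    viaImplicits.printSchema()
    viaSchema.show(5, truncate = false)
    spark.stop()
  }
}
```

Option 1 is shorter; Option 2 is the one to reach for when column types or nullability need to be set explicitly rather than inferred.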

Upvotes: 2
