Jesus Zuñiga

Reputation: 135

How can I create a new DataFrame from a list?

Hello, I have this function that gets the row values from a DataFrame, converts them into a list, and then makes a DataFrame from it.

// Gets the row content from the "content" column
val dfList = df.select("content").rdd.map(r => r(0).toString).collect.toList

val dataSet = sparkSession.createDataset(dfList)

// Makes a new DataFrame
sparkSession.read.json(dataSet)

What do I need to do to build a list from the other columns' values, so that I can have another DataFrame with those columns' values?

val dfList  = df.select("content","collection", "h").rdd.map(r => {
      println("******ROW********")
      println(r(0).toString)
      println(r(1).toString)
      println(r(2).toString) //These have the row values from the other 
                             //columns in the select
    }).collect.toList

Thanks.

Upvotes: 0

Views: 57

Answers (1)

spats

Reputation: 853

This approach doesn't look right: you don't need to collect the DataFrame just to add new columns. Try adding the columns directly to the DataFrame using withColumn() or withColumnRenamed() (see https://docs.azuredatabricks.net/spark/1.6/sparkr/functions/withColumn.html).
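As a rough sketch of what that could look like in Scala: the column names "content", "collection" and "h" come from the question, while the sample rows, the derived column "collection_upper", the new name "hash" and the helper addColumns are made up here purely for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, upper}

object WithColumnSketch {
  // Hypothetical helper: derives one column and renames another,
  // without ever collecting rows to the driver.
  def addColumns(df: DataFrame): DataFrame =
    df.withColumn("collection_upper", upper(col("collection")))
      .withColumnRenamed("h", "hash")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("with-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for the question's df.
    val df = Seq(
      ("{\"a\":1}", "users", "h1"),
      ("{\"a\":2}", "orders", "h2")
    ).toDF("content", "collection", "h")

    addColumns(df).show(false)
    spark.stop()
  }
}
```

Everything stays a DataFrame transformation, so Spark can keep the work distributed across the executors.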

If you want to bring in columns from another DataFrame, try a join. In any case, using collect is not a good idea, because it brings all of your data to the driver.
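A minimal sketch of the join idea, assuming both DataFrames share a key column: here "h" from the question is used as the key, and the second DataFrame, its "created" column, the sample rows and the helper joinOnH are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object JoinSketch {
  // Hypothetical helper: picks the wanted columns with select (no collect)
  // and pulls in extra columns from another DataFrame via an inner join on "h".
  def joinOnH(df: DataFrame, other: DataFrame): DataFrame =
    df.select("content", "collection", "h")
      .join(other, Seq("h"), "inner")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data; only the column names come from the question.
    val df = Seq(
      ("{\"a\":1}", "users", "h1"),
      ("{\"a\":2}", "orders", "h2")
    ).toDF("content", "collection", "h")

    val other = Seq(
      ("h1", "2019-01-01"),
      ("h2", "2019-01-02")
    ).toDF("h", "created")

    joinOnH(df, other).show(false)
    spark.stop()
  }
}
```

Selecting the columns directly gives you the "other columns" DataFrame without going through a driver-side list first.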

Upvotes: 1
