Scala: For loop on dataframe, create new column from existing by index

Question

I have a dataframe with two columns:

id (string), date (timestamp)

I would like to loop through the dataframe and add a new column containing a URL that includes the id. The algorithm should look something like this:

 add one new column with the following value:
 for each id
       "some url" + the value of the dataframe's id column

I tried to do this in Scala, but I can't work out how to get the id at index "a":

 val k = df2.count().asInstanceOf[Int]
      // for loop execution with a range
      for( a <- 1 to k){
         // println( "Value of a: " + a );
         val dfWithFileURL = dataframe.withColumn("fileUrl", "https://someURL/" + dataframe("id")[a])

      }

But this

dataframe("id")[a]

does not work in Scala. I have not found a solution yet, so any suggestions are welcome!

wBob · Accepted Answer

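You don't need a loop for this. withColumn applies a column expression to every row at once, so you can build the URL with concat and lit, something like this:

import org.apache.spark.sql.functions.{concat, lit}
import spark.implicits._  // for toDF and $"..."; already in scope in spark-shell and most notebooks

// Sample data
val df = Seq(
  ( 1, "1 Jan 2000" ),
  ( 2, "2 Feb 2014" ),
  ( 3, "3 Apr 2017" )
)
  .toDF("id", "date" )


// Add the fileUrl column
val dfNew = df
  .withColumn("fileUrl", concat(lit("https://someURL/"), $"id"))

// show returns Unit, so call it separately instead of assigning its result to dfNew
dfNew.show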
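
For the schema in the question (a string id and a timestamp date), the same idea can be wrapped in a small helper. This is only a minimal sketch: the object name FileUrlExample, the helper addFileUrl, the local SparkSession settings and the sample rows are all assumptions made up for illustration, not part of the original code.

```scala
import java.sql.Timestamp

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat, lit}

object FileUrlExample {

  // Hypothetical helper: appends a "fileUrl" column built from baseUrl + id.
  // The expression is evaluated for every row, so no loop or row index is needed.
  def addFileUrl(df: DataFrame, baseUrl: String): DataFrame =
    df.withColumn("fileUrl", concat(lit(baseUrl), col("id")))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; spark-shell and notebooks already provide `spark`.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("fileUrlExample")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows matching the question's schema: id (string), date (timestamp)
    val dataframe = Seq(
      ("a1", Timestamp.valueOf("2000-01-01 00:00:00")),
      ("b2", Timestamp.valueOf("2014-02-02 00:00:00"))
    ).toDF("id", "date")

    val dfWithFileURL = addFileUrl(dataframe, "https://someURL/")

    // Pass false so long URLs are not truncated in the printed table
    dfWithFileURL.show(false)

    spark.stop()
  }
}
```

The reason dataframe("id")[a] cannot work is that dataframe("id") is a Column, which describes an expression over the whole dataframe rather than a collection of values you can index into. That is also why the count() call and the for loop from the question are not needed here.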

