Osy

Reputation: 115

Add new column with its data to existing DataFrame using

In Scala, I have a List[String] that I want to add as a new column to an existing DataFrame.

Original DF:

Name  | Date
======|===========
Rohan | 2007-12-21
...   | ...
...   | ...

Suppose I want to add a new column, Department.

Expected DF:

Name | Date       | Department
=====|============|============
Rohan| 2007-12-21 | Comp
...  | ...        | ...
...  | ...        | ...

How can I do this in Scala?

Upvotes: 1

Views: 843

Answers (2)

Osy

Reputation: 115

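This solved my issue:

import org.apache.spark.sql.Row
// `dataset` is the existing DataFrame and `results` is the List[String].
// Key both sides by row index, join on that index, then append the list value to each Row.
val newrows = dataset.rdd.zipWithIndex.map(_.swap)
      .join(spark.sparkContext.parallelize(results).zipWithIndex.map(_.swap))
      .values
      .map { case (row: Row, x: String) => Row.fromSeq(row.toSeq :+ x) }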

I still need a precise explanation of how it works.
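
As a rough explanation, the snippet can be read step by step. The sketch below wraps the same idea in a helper and also turns the result back into a DataFrame, since the pipeline on its own produces an RDD[Row] with no schema. The helper name `appendColumn`, its parameters, and the `sortByKey` call are my own additions for illustration, not part of the original snippet. It assumes the list has exactly one value per row, in the same order as the rows.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical helper: appends `values` to `df` as a new string column,
// matching list elements to rows by position.
def appendColumn(spark: SparkSession, df: DataFrame, name: String, values: Seq[String]): DataFrame = {
  // zipWithIndex pairs each element with its position: (row, 0L), (row, 1L), ...
  // swap flips each pair to (index, row), so the index becomes the join key.
  val indexedRows = df.rdd.zipWithIndex.map(_.swap)
  val indexedValues = spark.sparkContext.parallelize(values).zipWithIndex.map(_.swap)

  val newRows = indexedRows
    .join(indexedValues) // inner join on index: (index, (row, value))
    .sortByKey()         // a join does not preserve order, so restore it (my addition)
    .values              // drop the index: (row, value)
    .map { case (row, value) => Row.fromSeq(row.toSeq :+ value) }

  // An RDD[Row] carries no schema, so extend the original schema by one field.
  val newSchema = StructType(df.schema.fields :+ StructField(name, StringType, nullable = true))
  spark.createDataFrame(newRows, newSchema)
}
```

With the names from the snippet above, this would be called as `appendColumn(spark, dataset, "Department", results)`. Because the join is an inner join, any row without a list value at the same index is dropped, so the two lengths should match.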

Upvotes: 1

Sandeep Purohit

Reputation: 3692

One way to do this is to create a second DataFrame containing the name and the list values, then join the two DataFrames on the name column.
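
A minimal sketch of that idea, assuming each list value can be paired with its name beforehand and that names are unique (duplicate names would multiply rows in the join). The helper name `joinOnName` and the `departments` parameter are hypothetical. The column names follow the example in the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper: `departments` holds (name, department) pairs.
def joinOnName(spark: SparkSession, df: DataFrame, departments: Seq[(String, String)]): DataFrame = {
  import spark.implicits._ // provides toDF on local Scala collections

  // Build the second DataFrame from the paired values.
  val deptDf = departments.toDF("Name", "Department")

  // Joining on the shared column name keeps a single Name column in the result.
  // A left join keeps every original row; unmatched names get a null Department.
  df.join(deptDf, Seq("Name"), "left")
}
```

If all you have is a bare List[String] with no key, this approach does not apply directly, and matching by row position is the remaining option.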

Upvotes: 1
