Spark & Scala: How can I replace values in different DataFrame columns?

Question

I have this DataFrame:

+----+-------+-----------+...+------+----------------+---------+
|mot1|  brand|     device|...|action|Column_to_modify|New_value|
+----+-------+-----------+...+------+----------------+---------+
|  09|  Tesla|         PC|...|modify|           brand|     Jeep|
|  10|  Tesla|SmallTablet|...|modify|           brand|     Jeep|
|  09|  Tesla|         PC|...|modify|           brand|     Jeep|
|  10|  Tesla|SmallTablet|...|modify|            mot1|       20|
|  10|  Tesla|SmallTablet|...|modify|            mot1|       20|
+----+-------+-----------+...+------+----------------+---------+

How can I update each row so that the column named in "Column_to_modify" is set to the value in "New_value"?

What I want is:

+----+-------+-----------+...+------+----------------+---------+
|mot1|  brand|     device|...|action|Column_to_modify|New_value|
+----+-------+-----------+...+------+----------------+---------+
|  09|   Jeep|         PC|...|modify|           brand|     Jeep|
|  10|   Jeep|SmallTablet|...|modify|           brand|     Jeep|
|  09|   Jeep|         PC|...|modify|           brand|     Jeep|
|  20|  Tesla|SmallTablet|...|modify|            mot1|       20|
|  20|  Tesla|SmallTablet|...|modify|            mot1|       20|
+----+-------+-----------+...+------+----------------+---------+

Any ideas?

pasha701 · Accepted Answer

One option is a UDF applied to each column that may need to be modified:

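import org.apache.spark.sql.functions.{lit, udf}
// Assumes a SparkSession named `spark` is in scope (as in spark-shell);
// needed for toDF and the $"..." column syntax.
import spark.implicits._

val df = List(
  ("09", "Tesla", "PC", "modify", "brand", "Jeep"),
  ("10", "Tesla", "SmallTablet", "modify", "brand", "Jeep"),
  ("09", "Tesla", "PC", "modify", "brand", "Jeep"),
  ("10", "Tesla", "SmallTablet", "modify", "mot1", "20"),
  ("10", "Tesla", "SmallTablet", "modify", "mot1", "20")
).toDF("mot1", "brand", "device", "action", "Column_to_modify", "New_value")

// Returns the new value when this column is the one named in Column_to_modify,
// otherwise keeps the current value.
val modifyColumn = (colName: String, colValue: String, modifyColumnName: String, modifyColumnValue: String) =>
  if (colName == modifyColumnName) modifyColumnValue else colValue

val modifyColumnUDF = udf(modifyColumn)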

val result = df
  .withColumn("mot1", modifyColumnUDF(lit("mot1"), $"mot1", $"Column_to_modify", $"New_value"))
  .withColumn("brand", modifyColumnUDF(lit("brand"), $"brand", $"Column_to_modify", $"New_value"))
result.show(false)

Output:

+----+-----+-----------+------+----------------+---------+
|mot1|brand|device     |action|Column_to_modify|New_value|
+----+-----+-----------+------+----------------+---------+
|09  |Jeep |PC         |modify|brand           |Jeep     |
|10  |Jeep |SmallTablet|modify|brand           |Jeep     |
|09  |Jeep |PC         |modify|brand           |Jeep     |
|20  |Tesla|SmallTablet|modify|mot1            |20       |
|20  |Tesla|SmallTablet|modify|mot1            |20       |
+----+-----+-----------+------+----------------+---------+
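
The same per-row rule can also be expressed without a UDF, using Spark's built-in when/otherwise and folding over the columns that may be modified. This is only a sketch of an alternative, not part of the answer above: the object name ReplaceByColumnName and the helper applyModifications are made up for illustration, and it assumes every target column has the same type as New_value (all strings here, as in the question).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

// Hypothetical helper object, named here only for illustration.
object ReplaceByColumnName {

  // For each candidate column: take New_value when Column_to_modify names
  // that column, otherwise keep the column's current value.
  // Assumes all target columns share New_value's type (String in this example).
  def applyModifications(df: DataFrame, targetCols: Seq[String]): DataFrame =
    targetCols.foldLeft(df) { (acc, name) =>
      acc.withColumn(
        name,
        when(col("Column_to_modify") === name, col("New_value")).otherwise(col(name))
      )
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("replace-by-column-name")
      .getOrCreate()
    import spark.implicits._

    // Same sample data as in the answer above.
    val df = List(
      ("09", "Tesla", "PC", "modify", "brand", "Jeep"),
      ("10", "Tesla", "SmallTablet", "modify", "brand", "Jeep"),
      ("09", "Tesla", "PC", "modify", "brand", "Jeep"),
      ("10", "Tesla", "SmallTablet", "modify", "mot1", "20"),
      ("10", "Tesla", "SmallTablet", "modify", "mot1", "20")
    ).toDF("mot1", "brand", "device", "action", "Column_to_modify", "New_value")

    applyModifications(df, Seq("mot1", "brand")).show(false)

    spark.stop()
  }
}
```

On the sample data this should give the same rows as the UDF version. Adding another modifiable column only means adding its name to the Seq, and built-in expressions stay visible to Spark's optimizer, which a UDF generally does not.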
