Reputation: 57
Am creating a new dataframe df using the below withcolumn conditions.I have the same usage of the below withcolumn conditions for other dataframes too.How to write these all withcolumn conditions as a generic function and access it across all dataframes.
val df = sampledf.withColumn("concat", concat($"columna", $"columnb", $"columnc"))
.withColumn("sub", $"columna" - $"columnb")
.withColumn("div", $"columna" / $"columnb")
.withColumn("mul", $"columna" * $"columnb")
Upvotes: 1
Views: 167
Reputation: 10382
Use higher order functions
.
Check below code.
Defining common function.
scala> def func(
f: (Column,Column) => Column,
cols:Column*
): Column = cols.reduce(f)
Sample DataFrame
scala> df.show(false)
+-------+-------+-------+
|columna|columnb|columnc|
+-------+-------+-------+
|1 |2 |3 |
+-------+-------+-------+
Creating Expressions.
scala> val colExpr = Seq(
| $"columna",
| $"columnb",
| $"columnc",
| func(concat(_,_),$"columna",$"columnb",$"columnc").as("concat"),
| func((_ / _),$"columna",$"columnb").as("div"),
| func((_ * _),$"columna",$"columnb").as("mul"),
| func((_ + _),$"columna",$"columnb").as("add"),
| func((_ - _),$"columna",$"columnb").as("sub")
| )
Applying Expressions.
scala> df.select(colExpr:_*).show(false)
+-------+-------+-------+------+---+---+---+---+
|columna|columnb|columnc|concat|div|mul|add|sub|
+-------+-------+-------+------+---+---+---+---+
|1 |2 |3 |123 |0.5|2 |3 |-1 |
+-------+-------+-------+------+---+---+---+---+
Check this post for more details.
Upvotes: 0
Reputation: 19328
Here's a reusable function:
def yourFunction()(df: DataFrame) = {
df.withColumn("concat", concat($"columna", $"columnb", $"columnc"))
.withColumn("sub", $"columna" - $"columnb")
.withColumn("div", $"columna" / $"columnb")
.withColumn("mul", $"columna" * $"columnb")
}
Here's how you can use the function:
val df = sampledf.transform(yourFunction())
See this post for more information about chaining DataFrame transformations with Spark. It's a really important design pattern to write clean Spark code.
Upvotes: 3