Reputation: 13
I am trying to create a UDF to replace some values in a DataFrame. I have the following DataFrame:
df1
+-------------+
| Periodicity |
+-------------+
| Monthly     |
| Daily       |
| Annual      |
+-------------+
If the column contains "Annual", I want to change it to "EveryYear", and if it contains "Daily", to "EveryDay". This is what I am trying:
val modifyColumn = () => if (df1.col("Periodicity").equals("Annual")) "EveryYear"
val modifyColumnUDF = udf(modifyColumn)
val result = df1.withColumn("Periodicity", modifyColumnUDF(df1.col("Periodicity")))
But it is giving me an EvaluateException. What am I doing wrong?
Upvotes: 0
Views: 371
Reputation: 870
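Your lambda takes no arguments, compares a Column object (not the value in each row) to a String, and has no else branch, so it cannot work as a one-argument UDF. You can use one of these approaches instead: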
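// First approach (needs org.apache.spark.sql.functions.{col, when})
dataFrame
  .withColumn("Periodicity",
    when(col("Periodicity") === "Annual", "EveryYear")
      .when(col("Periodicity") === "Monthly", "EveryMonth")
      .when(col("Periodicity") === "Daily", "EveryDay")
      // without a final otherwise, unmatched values would become null
      .otherwise(col("Periodicity")))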
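// Second approach (needs org.apache.spark.sql.functions.{col, udf})
val permutations = Map("Annual" -> "EveryYear", "Monthly" -> "EveryMonth", "Daily" -> "EveryDay")
// getOrElse keeps unmapped values as-is; permutations(origValue) would throw on them
val medianUDF = udf[String, String]((origValue: String) => permutations.getOrElse(origValue, origValue))
dataFrame.withColumn("Periodicity", medianUDF(col("Periodicity")))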
The second approach can be used if you have many permutations and/or want it to be configured dynamically.
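Since the lookup behind the UDF is plain Scala, it can be checked without a Spark session. Below is a minimal sketch, assuming only the two replacements asked for in the question; the names PeriodicityLookup, replacements and replacePeriodicity are hypothetical and not part of any Spark API.

```scala
object PeriodicityLookup {
  // Hypothetical mapping; in practice it could be loaded from configuration.
  val replacements: Map[String, String] =
    Map("Annual" -> "EveryYear", "Daily" -> "EveryDay")

  // Total lookup: values without a mapping are returned unchanged,
  // whereas replacements(value) would throw NoSuchElementException.
  def replacePeriodicity(value: String): String =
    replacements.getOrElse(value, value)

  def main(args: Array[String]): Unit = {
    val input = Seq("Monthly", "Daily", "Annual")
    // Prints Monthly, EveryDay, EveryYear on separate lines.
    input.map(replacePeriodicity).foreach(println)
  }
}
```

With Spark on the classpath, the same function could presumably be wrapped as udf(PeriodicityLookup.replacePeriodicity _) and applied with withColumn as in the second approach.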
Upvotes: 1