Reputation: 3121
I'm trying to check whether two Double columns in a DataFrame are equal to a certain degree of precision, so 49.999999 should equal 50. Is it possible to create a UDF and use it in a where clause? I am using Spark 2.0 in Scala.
Upvotes: 1
Views: 1643
Reputation: 25909
Assuming ctx is the SQL context:
val areEqual = ctx.udf.register("areEqual", (x: Double, y: Double, precision: Double) => math.abs(x - y) < precision)
and then (the constant precision has to be wrapped with lit from org.apache.spark.sql.functions):
df.where(areEqual($"col1", $"col2", lit(precision)))
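Because the UDF is also registered under the name "areEqual", it can alternatively be referenced in a SQL-style condition string; the 0.001 threshold below is just a placeholder value:
df.where("areEqual(col1, col2, 0.001)")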
Upvotes: 0
Reputation: 86
You can use a udf, but there is no need for that:
import org.apache.spark.sql.functions._
val precision: Double = ???
df.where(abs($"col1" - $"col2") < precision)
A udf call would work the same way, but would be less efficient, because Catalyst cannot optimize expressions hidden inside a black-box UDF:
df.where(yourUdf($"col1", $"col2"))
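yourUdf is not defined above; a minimal sketch, assuming the precision threshold is captured from the enclosing scope, could look like this:
import org.apache.spark.sql.functions.udf
// hypothetical UDF: returns true when the two values differ by less than precision
val yourUdf = udf((x: Double, y: Double) => math.abs(x - y) < precision)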
Upvotes: 6