alexgbelov

Reputation: 3121

How to use a UDF in a where clause in Scala Spark

I'm trying to check whether two Double columns in a DataFrame are equal to within a certain precision, so that 49.999999 is treated as equal to 50. Is it possible to create a UDF and use it in a where clause? I am using Spark 2.0 in Scala.

Upvotes: 1

Views: 1643

Answers (2)

Arnon Rotem-Gal-Oz

Reputation: 25909

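Assuming ctx is the SQLContext, register the UDF and keep the function that register returns:

val areEqual = ctx.udf.register("areEqual", (x: Double, y: Double, precision: Double) => math.abs(x - y) < precision)

and then call it in the where clause, passing the precision as a literal column (lit comes from org.apache.spark.sql.functions):

df.where(areEqual($"col1", $"col2", lit(precision)))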

Upvotes: 0

user9143381

Reputation: 86

You can use a UDF, but there is no need for one:

import org.apache.spark.sql.functions._

val precision: Double = ???

df.where(abs($"col1" - $"col2") < precision)

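A UDF call would work the same way, but it would be less efficient, because Spark cannot optimize the body of a UDF the way it optimizes built-in column expressions: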

df.where(yourUdf($"col1", $"col2"))
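For comparison, here is a minimal end-to-end sketch of both approaches side by side. The object name, the helper approxEqual, the sample rows, and the precision of 1e-4 are all made up for illustration; it assumes Spark 2.x on the classpath and a local master.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{abs, lit, udf}

object ApproxEqualExample {
  // Hypothetical helper: true when x and y differ by less than precision.
  def approxEqual(x: Double, y: Double, precision: Double): Boolean =
    math.abs(x - y) < precision

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("approx-equal")
      .getOrCreate()
    import spark.implicits._

    // Illustrative data: the first row is "equal" within 1e-4, the second is not.
    val df = Seq((49.999999, 50.0), (1.0, 2.0)).toDF("col1", "col2")
    val precision = 1e-4

    // Option 1: built-in column functions only, no UDF involved.
    val viaBuiltin = df.where(abs($"col1" - $"col2") < precision)

    // Option 2: wrap the helper in a UDF; the precision is passed as a literal column.
    val approxEqualUdf =
      udf((x: Double, y: Double, p: Double) => approxEqual(x, y, p))
    val viaUdf = df.where(approxEqualUdf($"col1", $"col2", lit(precision)))

    viaBuiltin.show()
    viaUdf.show()

    spark.stop()
  }
}
```

Both filters should keep the same rows; the built-in version is usually the better default, and the UDF is worth it only when the comparison cannot be expressed with column functions.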

Upvotes: 6
