Reputation: 115
I have a UDF for calculating the distance between two coordinates.
import org.apache.spark.sql.functions._
import scala.math._
def calculateDistance(la1:Double, lo1:Double,la2:Double,lo2:Double): Double => udf(
{
val R = 6373.0
val lat1 = toRadians(la1)
val lon1 = toRadians(lo1)
val lat2 = toRadians(la2)
val lon2 = toRadians(lo2)
val dlon = lon2 - lon1
val dlat = lat2 - lat1
val a = pow(sin(dlat / 2),2) + cos(lat1) * cos(lat2) * pow(sin(dlon / 2),2)
val c = 2 * atan2(sqrt(a), sqrt(1 - a))
val distance = R * c
}
)
Here is the DataFrame schema:
dfcity: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [Name: string, LAT: double ... 10 more fields]
root
|-- SCITY: string (nullable = true)
|-- LAT: double (nullable = true)
|-- LON: double (nullable = true)
|-- ADD: integer (nullable = true)
|-- CODEA: integer (nullable = true)
|-- CODEB: integer (nullable = true)
|-- TCITY: string (nullable = true)
|-- TLAT: double (nullable = true)
|-- TLON: double (nullable = true)
|-- TADD: integer (nullable = true)
|-- TCODEA: integer (nullable = true)
|-- TCODEB: integer (nullable = true)
When I try to use it with withColumn,
val dfcitydistance = dfcity.withColumn("distance", calculateDistance($"LAT", $"LON",$"TLAT", $"TLON"))
it generates this error:
6: error: too many arguments for method calculateDistance: (distance: Double)
What is wrong with the way I am passing columns to the UDF? Please advise. Thank you very much.
Upvotes: 0
Views: 1587
Reputation: 22439
There are a couple of issues with your code:
def calculateDistance(la1:Double, lo1:Double, la2:Double, lo2:Double): Double => udf( {
// ...
val distance = R * c
} )
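First, calculateDistance is declared with def as a method that takes plain Double values, so it cannot be applied to Column arguments. udf should instead wrap a function literal, with the result assigned to a val.
Second, val distance = R * c is an assignment, so the block returns Unit. You should either append a line with just distance or simply replace the assignment with the expression R * c.
Your UDF should look like the following: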
val calculateDistance = udf( (la1:Double, lo1:Double, la2:Double, lo2:Double) => {
// ...
R * c
} )
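For completeness, here is a minimal end-to-end sketch of that shape, assuming a local SparkSession and a one-row stand-in for dfcity. The names DistanceUdfSketch and haversineKm, and the sample row, are made up for illustration; the formula and the radius 6373.0 are taken from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf
import scala.math.{atan2, cos, pow, sin, sqrt, toRadians}

object DistanceUdfSketch {
  // Plain Scala function (hypothetical name) holding the haversine formula
  // from the question. Keeping it separate from Spark makes it easy to test.
  def haversineKm(la1: Double, lo1: Double, la2: Double, lo2: Double): Double = {
    val R = 6373.0 // Earth radius in km, as used in the question
    val lat1 = toRadians(la1)
    val lon1 = toRadians(lo1)
    val lat2 = toRadians(la2)
    val lon2 = toRadians(lo2)
    val dlon = lon2 - lon1
    val dlat = lat2 - lat1
    val a = pow(sin(dlat / 2), 2) + cos(lat1) * cos(lat2) * pow(sin(dlon / 2), 2)
    val c = 2 * atan2(sqrt(a), sqrt(1 - a))
    R * c // the last expression is the block's value, so this returns a Double
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, only for trying the sketch out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("distance-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // udf wraps the function value; the result can be applied to Columns.
    val calculateDistance = udf(haversineKm _)

    // Hypothetical one-row stand-in for dfcity, with only the columns needed here.
    val dfcity = Seq(("A", 0.0, 0.0, "B", 0.0, 90.0))
      .toDF("SCITY", "LAT", "LON", "TCITY", "TLAT", "TLON")

    val dfcitydistance =
      dfcity.withColumn("distance", calculateDistance($"LAT", $"LON", $"TLAT", $"TLON"))
    dfcitydistance.show()

    spark.stop()
  }
}
```

Keeping the formula in an ordinary function and wrapping it with udf at the point of use is one possible layout; writing the function literal directly inside udf(...), as above, works the same way.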
Upvotes: 1
Reputation: 11
It should be
val calculateDistance = udf((la1:Double, lo1:Double,la2:Double,lo2:Double) => {
...
})
What you have defined right now is a method that takes local Double values and returns a nullary UDF, rather than a UDF that takes four columns.
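To make the distinction concrete, the sketch below shows the types involved. It uses a deliberately tiny two-argument UDF; UdfShapeSketch and addTwo are hypothetical names chosen for illustration, while LAT and TLAT are column names from the question's schema.

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, udf}

object UdfShapeSketch {
  // A UDF value: built once from a function literal over plain Doubles.
  val addTwo: UserDefinedFunction = udf((a: Double, b: Double) => a + b)

  // Applying the UDF value to Columns yields a Column expression,
  // which is the kind of argument withColumn expects.
  val expr: Column = addTwo(col("LAT"), col("TLAT"))
}
```

The function literal inside udf(...) still works on ordinary Double values; it is the surrounding UserDefinedFunction that accepts Column arguments.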
Upvotes: 1