Reputation: 165
I am analysing the following piece of code:
from pyspark.sql.functions import udf, col, desc

def error(value, pred):
    return abs(value - pred)

udf_MAE = udf(lambda value, pred: MAE(value=value, pred=pred), FloatType())
I know a udf is a user-defined function, but I don't understand what that means here, because udf wasn't defined anywhere earlier in the code.
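For reference, a minimal sketch of how such a udf is typically applied to a DataFrame (assuming an active SparkSession named spark, columns named value and pred, and that MAE refers to the absolute-error function above; these names are assumptions, not part of the original snippet):
from pyspark.sql.functions import udf, col
from pyspark.sql.types import FloatType

def MAE(value, pred):
    return abs(value - pred)

udf_MAE = udf(lambda value, pred: MAE(value=value, pred=pred), FloatType())

# Apply the UDF to DataFrame columns; Spark evaluates it row by row.
df = spark.createDataFrame([(3.0, 2.5), (1.0, 1.5)], ['value', 'pred'])
df.withColumn('abs_error', udf_MAE(col('value'), col('pred'))).show()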
Upvotes: 1
Views: 2727
Reputation: 139
User Defined Functions (UDFs) are useful when you need to define logic specific to your use case and when you need to encapsulate that solution for reuse. They should only be used when there is no clear way to accomplish a task using built-in functions (see the Azure Databricks documentation).
Create your function (after you have made sure there is no built-in function that performs a similar task):
def greatingFunc(name):
    return f'hello {name}!'
Then you need to register your function as a UDF by designating the following:
A name for access in Python (myGreatingUDF)
The function itself (greatingFunc)
The return type for the function (StringType)
from pyspark.sql.types import StringType
myGreatingUDF = spark.udf.register("myGreatingUDF", greatingFunc, StringType())
Now you can call your UDF whenever you need it. The call returns a Column expression, so apply it to a DataFrame column:
guests_df = spark.createDataFrame([('John',), ('Jane',)], ['name'])
guests_df.withColumn('greeting', myGreatingUDF('name')).show()
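Because the function was registered under a name, the same UDF is also available from Spark SQL. A minimal sketch reusing the DataFrame above (the view name guests is only an example):
# Registered UDFs can be called by name inside Spark SQL queries.
guests_df.createOrReplaceTempView('guests')
spark.sql("SELECT name, myGreatingUDF(name) AS greeting FROM guests").show()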
Upvotes: 2