Reputation: 322
When calling a function from an external class many times, which gives better performance: a lazy val function or a def method?
So far, what I understood is:
def method -
lazy val lambda expression -
So it seems that using a lazy val would avoid evaluating the function every time. Should it be preferred?
I ran into this while writing UDFs for Spark code, and I'm trying to understand which approach is better.
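    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.expressions.UserDefinedFunction
    import org.apache.spark.sql.functions.{col, udf}

    object sql {
      // Maps null, blank, "[]" and "null" strings to None; otherwise returns the trimmed string.
      def emptyStringToNull(str: String): Option[String] = {
        Option(str).getOrElse("").trim match {
          case "" => None
          case "[]" => None
          case "null" => None
          case _ => Some(str.trim)
        }
      }

      // Wraps the method as a Spark UDF (being a def, it builds a new UDF object on each access).
      def udfEmptyStringToNull: UserDefinedFunction = udf(emptyStringToNull _)

      // Variant 1: a method.
      def repairColumn_method(dataFrame: DataFrame, colName: String): DataFrame = {
        dataFrame.withColumn(colName, udfEmptyStringToNull(col(colName)))
      }

      // Variant 2: a function value held in a lazy val.
      lazy val repairColumn_fun: (DataFrame, String) => DataFrame = { (df, colName) =>
        df.withColumn(colName, udfEmptyStringToNull(col(colName)))
      }
    }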
Upvotes: 0
Views: 761
Reputation: 7604
There's no need for you to use a lazy val in this specific case. When you assign a function to a lazy val, its results are not memoized, as you seem to think they are; only the function object itself is created once, and its body still runs on every call. Since the function itself is a plain function literal and not the result of an expensive computation (regardless of what goes on inside it), making it lazy is not useful. All it does is add overhead when accessing and calling it. A simple val would be better, but making it a proper method would be best.
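To illustrate the point about results not being memoized, here is a minimal sketch in plain Scala (no Spark). The object name and the two counters are made up for the demonstration; they only track how often the function object is created versus how often its body runs.

```scala
object LazyFunctionDemo {
  // Hypothetical counters, for illustration only.
  var initRuns = 0 // how many times the function object is created
  var bodyRuns = 0 // how many times the function body is executed

  // The right-hand side runs once, on first access, and yields a function object.
  // Calling that function afterwards still executes its body every time.
  lazy val addOne: Int => Int = {
    initRuns += 1
    (x: Int) => { bodyRuns += 1; x + 1 }
  }
}
```

Calling LazyFunctionDemo.addOne(1) twice should leave initRuns at 1 and bodyRuns at 2: the laziness applies to creating the function, not to the results of calling it.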
If you want memoization, see "Is there a generic way to memoize in Scala?" instead.
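For a rough idea of what memoization looks like, here is one possible sketch for single-argument functions. The memoize helper, slowSquare and the calls counter are hypothetical names invented for this example, not taken from the linked question or from any library, and the cache is not thread-safe.

```scala
import scala.collection.mutable

object MemoizeSketch {
  // Hypothetical helper: wraps a one-argument function so that each
  // distinct input is computed at most once. Not thread-safe.
  def memoize[A, B](f: A => B): A => B = {
    val cache = mutable.Map.empty[A, B]
    (a: A) => cache.getOrElseUpdate(a, f(a))
  }

  // Counts how often the underlying function really runs.
  var calls = 0
  val slowSquare: Int => Int = { x => calls += 1; x * x }
  val fastSquare: Int => Int = memoize(slowSquare)
}
```

With this, repeated calls such as MemoizeSketch.fastSquare(4) run slowSquare only for the first occurrence of each argument.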
Ignoring your specific example, if the def in question didn't take any arguments and both it and the lazy val were simple values that were expensive to compute, I would go with the lazy val if you're going to call it many times to avoid computing it over and over again.
If they were values that are very cheap to compute and you're not going to access them many times, or if they're expensive to compute but you're only going to access them once, I would go with a def instead. There wouldn't be much difference if you used a lazy val, but a def avoids creating a couple of extra fields.
If they're somewhat cheap to compute but accessed many times, it may be better to use a lazy val simply because the result will be cached. However, you might want to look at your overall design before looking at such micro-optimizations.
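The difference between a parameterless def and a lazy val can be seen in a small sketch like the following. The names and counters are invented for the example, and expensive() is only a stand-in for a genuinely costly computation.

```scala
object EvaluationDemo {
  // Hypothetical counters, for illustration only.
  var defRuns = 0
  var lazyRuns = 0

  // Stand-in for an expensive computation.
  private def expensive(): Int = (1 to 1000).sum

  // Re-evaluated on every access.
  def viaDef: Int = { defRuns += 1; expensive() }

  // Evaluated on first access, then cached in a field.
  lazy val viaLazyVal: Int = { lazyRuns += 1; expensive() }
}
```

Accessing viaDef three times should bump defRuns to 3, while accessing viaLazyVal three times leaves lazyRuns at 1.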
Upvotes: 3