Reputation: 1233
I was looking at a code sample (given below). I noticed what looks like an anonymous function (the line below the "// What is this?" comment in the snippet) defined inside this method. What exactly is it, and how does it get invoked?
def initHasher(requestFilePath: String) = {
  import spark.implicits._
  val hashes = spark.read.option("delimiter", ",").option("header", "true").csv(requestFilePath)
    .select($"Hash", $"Count").rdd
    .map(r => (r.getString(0), r.getString(1))).collectAsMap()
  val broadcastedHashes = spark.sparkContext.broadcast(hashes)
  // What is this?
  (str: String) => {
    if (str != null && str.length > 0) {
      val hash = sha256hash(str)
      broadcastedHashes.value.get(hash) match {
        case None => hash
        case Some(count) => sha256hash(str + ":" + count)
      }
    }
    else
      null
  }
}
Upvotes: 0
Views: 277
Reputation: 42440
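initHasher initializes a hasher and returns it as a function (the anonymous function you are seeing). In Scala the last expression in a method body is the method's return value, so the function literal after the comment is what initHasher returns. Nothing invokes it inside initHasher; it runs only when the caller applies the returned value. It would be used like this: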
// initialize your hasher here
val hasher = initHasher(requestFilePath)
// now you can use the hasher
val hash = hasher("my string")
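To see the same pattern without a Spark session, here is a minimal sketch. It assumes a plain immutable Map in place of the broadcast variable, and the sha256hash shown is a hypothetical stand-in, since the original helper is not included in the question. The name HasherSketch is also made up for illustration.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object HasherSketch {
  // Hypothetical stand-in for the sha256hash helper, which the question does not show.
  def sha256hash(s: String): String =
    MessageDigest.getInstance("SHA-256")
      .digest(s.getBytes(StandardCharsets.UTF_8))
      .map("%02x".format(_))
      .mkString

  // Assumption: a plain Map replaces the broadcast lookup table from the question.
  // The explicit return type String => String makes it visible that a function comes back.
  def initHasher(counts: Map[String, String]): String => String = {
    // Last expression of the method body: a function literal that closes over `counts`.
    (str: String) => {
      if (str != null && str.nonEmpty) {
        val hash = sha256hash(str)
        counts.get(hash) match {
          case None        => hash
          case Some(count) => sha256hash(str + ":" + count)
        }
      } else null
    }
  }

  def main(args: Array[String]): Unit = {
    // The setup work happens once, here.
    val hasher = initHasher(Map(sha256hash("a") -> "3"))

    // The returned function runs each time it is applied.
    println(hasher("b") == sha256hash("b"))     // true: "b" is not in the table
    println(hasher("a") == sha256hash("a:3"))   // true: "a" is in the table with count "3"
    println(hasher("") == null)                 // true: empty input yields null
  }
}
```

The point of the design is that the expensive setup (reading the file and building the lookup table) runs once in initHasher, while the returned function keeps a reference to that table and can be called as many times as needed. In Spark code, a function like this is often wrapped with org.apache.spark.sql.functions.udf so it can be applied to a column; whether the original sample does that is not shown in the question.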
Upvotes: 4