Reputation: 2083
I started working with Scala very recently and came across its Future feature. I had posted a question asking for help with my code and got some help from it.
In that conversation, I was told that it is not recommended to retrieve the value from a Future.
I understand that a Future runs in parallel when executed, but if retrieving its value is not recommended, how and when do I access the result? If the purpose of a Future is to run a thread/process independently of the main thread, why is it not recommended to access it? Will the Future automatically assign its output to its caller? If so, how would we know when to access it?
I wrote the code below to return a Future with a Map[String, String].
def getBounds(incLogIdMap:scala.collection.mutable.Map[String, String]): Future[scala.collection.mutable.Map[String, String]] = Future {
  var boundsMap = scala.collection.mutable.Map[String, String]()
  incLogIdMap.keys.foreach(table => if(!incLogIdMap(table).contains("INVALID")) {
      val minMax = s"select max(cast(to_char(update_tms,'yyyyddmmhhmmss') as bigint)) maxTms, min(cast(to_char(update_tms,'yyyyddmmhhmmss') as bigint)) minTms from queue.${table} where key_ids in (${incLogIdMap(table)})"
      val boundsDF = spark.read.format("jdbc").option("url", commonParams.getGpConUrl()).option("dbtable", s"(${minMax}) as ctids")
        .option("user", commonParams.getGpUserName()).option("password", commonParams.getGpPwd()).load()
      val maxTms = boundsDF.select("minTms").head.getLong(0).toString + "," + boundsDF.select("maxTms").head.getLong(0).toString
      boundsMap += (table -> maxTms)
    }
  )
  boundsMap
}
If I have to use the value returned from the method getBounds, can I access it in the way shown below?
val tmsobj = new MinMaxVals(spark, commonParams)
tmsobj.getBounds(incLogIds) onComplete ({
  case Success(Map) => val boundsMap = tmsobj.getBounds(incLogIds)
  case Failure(value) => println("Future failed..")
})
Could anyone clear up my doubts?
Upvotes: 0
Views: 1160
Reputation: 51271
As the others have pointed out, waiting to retrieve a value from a Future defeats the whole point of launching the Future in the first place.
But onComplete() doesn't cause the rest of your code to wait. It just attaches extra instructions to be carried out once the Future completes, while the rest of your code goes on its merry way.
So what's wrong with your proposed code to access the result of getBounds()? Let's walk through it.
tmsobj.getBounds(incLogIds) onComplete {            // launch Future, when it completes ...
  case Success(m) =>                                // if Success then store the result Map in local variable "m"
    val boundsMap = tmsobj.getBounds(incLogIds)     // launch a new and different Future
                                                    // boundsMap is a local variable, it disappears after this code block
  case Failure(value) =>                            // if Failure then store error in local variable "value"
    println("Future failed..")                      // send some info to STDOUT
}                                                   // end of code block
You'll note that I changed Success(Map) to Success(m) because a capitalized name in a pattern refers to an existing value (here, the Map companion object) rather than binding a new variable, so it can't be used to capture the result of your Future.
In conclusion: onComplete() doesn't cause your code to wait on the Future, which is good, but it is somewhat limited because it returns Unit, i.e. it has no return value with which it can communicate the result of the Future.
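As a rough sketch of what that means in practice, the result can be consumed inside the callback itself, and the Future is launched only once. Everything here is illustrative: fakeBounds is a hypothetical stand-in for getBounds that needs no Spark session, and report is a made-up helper.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object OnCompleteSketch {
  // Hypothetical stand-in for getBounds: no Spark, just a fixed Map.
  def fakeBounds(): Future[Map[String, String]] =
    Future(Map("orders" -> "20190101000000,20190131235959"))

  // Turns the outcome of the Future into a message.
  def report(result: Try[Map[String, String]]): String = result match {
    case Success(m)  => s"Got bounds for ${m.size} table(s)"
    case Failure(ex) => s"Future failed: ${ex.getMessage}"
  }

  def main(args: Array[String]): Unit = {
    val bounds = fakeBounds()                   // launch the Future once
    bounds.onComplete(r => println(report(r)))  // use the result inside the callback
    // Demo only: wait for the Future so the program does not exit immediately.
    // The callback is scheduled separately and may run slightly after this returns.
    Await.ready(bounds, 5.seconds)
  }
}
```

Because onComplete returns Unit, anything that needs the Map has to happen inside the callback, which is why it tends to fit best at the outer edge of a program.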
Upvotes: 4
Reputation: 27421
It is not recommended to wait for a Future using Await.result because this blocks the execution of the current thread until some unknown point in the future, possibly forever.
It is perfectly OK to process the value of a Future by passing a processing function to a call such as map on the Future. This will call your function when the future is complete. The result of map is another Future, which can, in turn, be processed using map, onComplete or other methods.
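A minimal sketch of that chaining, assuming a hypothetical fakeBounds in place of the asker's getBounds (the table names and values are invented for illustration):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object MapSketch {
  // Hypothetical stand-in for getBounds.
  def fakeBounds(): Future[Map[String, String]] =
    Future(Map("orders" -> "100,200", "items" -> "5,9"))

  // map registers a function and returns a new Future holding its result.
  def tableNames(): Future[List[String]] =
    fakeBounds().map(bounds => bounds.keys.toList.sorted)

  // The new Future can be transformed again, still without blocking.
  def summary(): Future[String] =
    tableNames().map(names => names.mkString("Tables: ", ", ", ""))

  def main(args: Array[String]): Unit =
    // Blocking here is only for the demo, at the very edge of the program.
    println(Await.result(summary(), 5.seconds))
}
```

Neither tableNames nor summary waits for anything; each just describes what should happen to the value once it exists.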
Upvotes: 1
Reputation: 4595
TL;DR: Futures are not meant to manage shared state, but they are good for composing asynchronous pieces of code. You can use map, flatMap and many other operations to combine Futures.
The computation that the Future represents will be executed using the given ExecutionContext (usually given implicitly), which will usually be on a thread-pool, so you are right to assume that the Future computation happens in parallel. Because of this concurrency, it is generally not advised to mutate state that is shared from inside the body of the Future, for example:
var i: Int = 0
val f: Future[Unit] = Future {
  // Some computation
  i = 42
}
Doing so runs the risk of i also being accessed or modified from another thread (maybe the "main" one). In this kind of concurrent access situation, Futures would probably not be the right concurrency model, and you could imagine using monitors or message-passing instead.
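When the shared variable only exists to carry a result out of the Future, one alternative is to let the Future yield the value itself. This is only a sketch; compute and doubled are hypothetical names.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object NoSharedStateSketch {
  // Instead of assigning to an outer var, the Future produces the value.
  def compute(): Future[Int] = Future {
    // Some computation
    42
  }

  // Downstream code receives the value through map, so no variable is shared between threads.
  def doubled(): Future[Int] = compute().map(_ * 2)

  def main(args: Array[String]): Unit =
    println(Await.result(doubled(), 5.seconds)) // blocking only for the demo
}
```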
Another possibility that is tempting but also discouraged is to block the main thread until the result becomes available:
import scala.concurrent.duration._
val f: Future[Int] = Future { 42 }
val i: Int = Await.result(f, 5.seconds) // blocks the calling thread for up to 5 seconds
The reason this is bad is that you will completely block the main thread, negating the benefits of having concurrent execution in the first place. If you do this too much, you might also run into trouble because of a large number of threads that are blocked and hogging resources.
How do you then know when to access the result? You don't, and that is the reason why you should try to compose Futures as much as possible, and only subscribe to their onComplete method at the very edge of your application. It's typical for most of your methods to take and return Futures, and only subscribe to them in very specific places.
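A small sketch of that style, under the assumption that the min and max lookups can be modelled as two independent Futures. fetchMin, fetchMax and their return values are hypothetical placeholders, not part of the asker's code.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ComposeSketch {
  // Hypothetical lookups; real code would query the database here.
  def fetchMin(table: String): Future[Long] = Future(10L)
  def fetchMax(table: String): Future[Long] = Future(99L)

  // Combines two Futures into one without waiting on either.
  def bounds(table: String): Future[String] = {
    val minF = fetchMin(table) // start both before the for-comprehension
    val maxF = fetchMax(table) // so they can run concurrently
    for {
      mn <- minF
      mx <- maxF
    } yield s"$mn,$mx"
  }

  // Turns a list of per-table Futures into a single Future of a Map.
  def allBounds(tables: List[String]): Future[Map[String, String]] =
    Future.sequence(tables.map(t => bounds(t).map(b => t -> b))).map(_.toMap)

  def main(args: Array[String]): Unit = {
    val result = allBounds(List("orders", "items"))
    // The only place that waits is the edge of the program.
    println(Await.result(result, 5.seconds))
  }
}
```

Every method above takes plain values and returns a Future, and only main decides when the result is actually needed.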
Upvotes: 3