Reputation: 10411
Consider a scenario in which I am implementing a system that processes incoming tasks using Akka. I have a primary actor that receives tasks and dispatches them to some worker actors that process the tasks.
My first instinct is to implement this by having the dispatcher create an actor for each incoming task. After the worker actor processes the task it is stopped.
This seems to be the cleanest solution to me since it adheres to the principle of "one task, one actor". The other solution would be to reuse actors, but this involves the extra complexity of cleanup and some pool management.
I know that actors in Akka are cheap. But I am wondering if there is an inherent cost associated with repeated creation and deletion of actors. Is there any hidden cost associated with the data structures Akka uses for the bookkeeping of actors?
The load should be of the order of tens or hundreds of tasks per second - think of it as a production webserver that creates one actor per request.
Of course, the right answer lies in profiling and fine-tuning the system based on the type of incoming load. But I wondered if anyone could tell me something from their own experience?
LATER EDIT:
I should have given more details about the task at hand:
Note: I appreciate alternative solutions to my problem, and I will certainly take them into consideration. However, I would also like an answer to the main question regarding the intensive creation and deletion of actors in Akka.
Upvotes: 28
Views: 6762
Reputation: 2081
Actors make great finite state machines so let that help drive your design here. If your request handling state is greatly simplified by having one actor per request then do that. I find that actors are particularly good at managing more than two states as a rule of thumb.
A common approach, though, is a single request-handling actor that keeps the state of each request in a collection it maintains as part of its own state. Note that this can also be achieved with an Akka reactive stream and the use of the scan stage.
Upvotes: 0
Reputation: 24413
You should not create an actor for every request; you should rather use a router to dispatch the messages to a dynamic number of actors. That's what routers are for. Read this part of the docs for more information: http://doc.akka.io/docs/akka/2.0.4/scala/routing.html
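As a rough illustration of the router idea, here is a minimal sketch. It assumes the classic actor API of newer Akka releases (RoundRobinPool), which differs from the 2.0.4 API described in the linked docs. The Task and Result messages, the Worker and Dispatcher classes, and the pool size of 4 are hypothetical and not taken from the question.

```scala
import akka.actor.{Actor, ActorRef, Props}
import akka.routing.RoundRobinPool

// Hypothetical message types, invented for this sketch.
final case class Task(id: Int)
final case class Result(id: Int)

class Worker extends Actor {
  def receive: Receive = {
    // Placeholder for the real work: just echo the task id back.
    case Task(id) => sender() ! Result(id)
  }
}

class Dispatcher extends Actor {
  // The pool is a child of the dispatcher (context.actorOf);
  // its workers are created once and reused for every task.
  private val workers: ActorRef =
    context.actorOf(RoundRobinPool(4).props(Props(new Worker)), "workers")

  def receive: Receive = {
    // forward keeps the original sender, so the worker replies to it directly.
    case t: Task => workers.forward(t)
  }
}
```

With this shape the dispatcher never creates or stops actors per task; the pool size is the only thing to tune.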
edit:
Creating top-level actors (system.actorOf) is expensive, because every top-level actor will initialize an error kernel as well, and those are expensive. Creating child actors (context.actorOf, called from inside an actor) is way cheaper.
But I still suggest you rethink this, because depending on the frequency of the creation and deletion of actors you will also put additional pressure on the GC.
edit2:
And most important, actors are not threads! So even if you create 1M actors, they will only run on as many threads as the pool has. So depending on the throughput setting in the config every actor will process n messages before the thread gets released to the pool again.
Note that blocking a thread (includes sleeping) will NOT return it to the pool!
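The point that many actors share a few threads can be illustrated by analogy with plain Scala futures on a fixed thread pool. This is only a sketch using the standard library, not Akka's dispatcher itself; the PoolDemo object and its threadsUsed method are hypothetical names.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object PoolDemo {
  // Runs `tasks` tiny jobs on a pool of `poolSize` threads and returns
  // the names of the threads that actually executed them.
  def threadsUsed(tasks: Int, poolSize: Int): Set[String] = {
    val pool = Executors.newFixedThreadPool(poolSize)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // Each job only reports which pool thread ran it.
      val jobs = (1 to tasks).map(_ => Future(Thread.currentThread.getName))
      Await.result(Future.sequence(jobs), 30.seconds).toSet
    } finally pool.shutdown()
  }
}
```

Calling PoolDemo.threadsUsed(10000, 4) can never report more than 4 distinct thread names, however many jobs are submitted. A job that blocks or sleeps would hold one of those 4 threads for the whole time.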
Upvotes: 22
Reputation: 1713
I've tested with 10000 remote actors created from some main context by a root actor, the same scheme by which a single actor is created in the prod module. MBP 2.5GHz x2:
Code:
def start(userName: String) = {
  logger.error("HELLOOOOOOOO ")
  val n: Int = 10000
  var t0, t1: Long = 0
  t0 = System.nanoTime
  // "until" (not "to") so that exactly n actors are started
  for (i <- 0 until n) {
    val msg = StartClient(userName + i)
    // Block until the root actor confirms that the client actor was started
    Await.result(rootActor ? msg, timeout.duration).asInstanceOf[ClientStarted] match {
      case succ @ ClientStarted(userName) =>
        // logger.info("[C][SUCC] Client started: " + succ)
      case _ =>
        logger.error("Terminated on waiting for response from " + i + "-th actor")
        throw new RuntimeException("[C][FAIL] Could not start client: " + msg)
    }
  }
  t1 = System.nanoTime
  // Average start time per actor, in milliseconds
  logger.error("Starting of a single actor of " + n + ": " + ((t1 - t0) / 1000000.0 / n.toDouble) + " ms")
}
The result:
Starting of a single actor of 10000: 0.3642917 ms
There was a message stating that "Slf4jEventHandler started" between "HELLOOOOOOOO" and "Starting of a single", so the experiment seems even more realistic (?)
The dispatcher was the default (a PinnedDispatcher starting a new thread each and every time), and it seemed like all that stuff costs the same as Thread.start() has for a long, long time since Java 1: 500K-1M cycles or so.
That's why I replaced all the code inside the loop with new java.lang.Thread().start()
The result:
Starting of a single actor of 10000: 0.1355219 ms
Upvotes: 1
Reputation: 40461
An actor which will receive one message right after its creation and die right after sending the result can be replaced by a future. Futures are more lightweight than actors.
You can use pipeTo to receive the future's result when it's done. For instance, in your actor launching the computations:
def receive = {
  case t: Task => future { executeTask( t ) }.pipeTo(self)
  case r: Result => processTheResult(r)
}
where executeTask is your function taking a Task and returning a Result.
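The same idea, one future per task instead of one short-lived actor per task, can be sketched with the Scala standard library alone. The Task and Result case classes, the doubling done by executeTask, and the runAll helper below are hypothetical stand-ins, not the asker's real types.

```scala
import scala.concurrent.{ExecutionContext, Future}

object FutureTasks {
  // Hypothetical stand-ins for the Task and Result types mentioned above.
  final case class Task(input: Int)
  final case class Result(output: Int)

  // Placeholder work; the real executeTask would do the actual processing.
  def executeTask(t: Task): Result = Result(t.input * 2)

  // Start one future per task and collect the results in task order.
  def runAll(tasks: Seq[Task])(implicit ec: ExecutionContext): Future[Seq[Result]] =
    Future.sequence(tasks.map(t => Future(executeTask(t))))
}
```

Inside an actor, each of these futures could be piped back to self with pipeTo as in the snippet above, rather than being collected with Future.sequence.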
However, I would reuse actors from a pool through a router, as explained in @drexin's answer.
Upvotes: 8