How to manage Akka Actor's paths in distributed system?

Question

Suppose I have a the following two Actors

Store
Product

Every Store can have multiple Products and I want to dynamically split the Store into StoreA and StoreB on high traffic on multiple machines. The splitting of Store will also split the Products evenly between StoreA and StoreB.

My question is: what are the best practices of knowing where to send all the future BuyProduct requests to (StoreA or StoreB) after the split ? The reason I'm asking this is because if a request to buy ProductA is received I want to send it to the right store which already has that Product's state in memory.

Solution: The only solution I can think of is to store the path of each Product Map[productId:Long, storePath:String] in a ProductsPathActor every time a new Product is created and for every BuyProduct request I will query the ProductPathActor which will return the correct Store's path and then send the BuyProduct request to that Store ?

Is there another way of managing this in Akka or is my solution correct ?

Brian Topping · Accepted Answer

One good way to do this is with Akka Cluster Sharding. From the docs:

Cluster sharding is useful when you need to distribute actors across several nodes in the cluster and want to be able to interact with them using their logical identifier, but without having to care about their physical location in the cluster, which might also change over time.

There is an Activator Template that demonstrates it here.

To your problem, the concept of StoreA and StoreB are each a ShardRegion and map 1:1 with to your cluster nodes. The ShardCoordinator manages distribution between these nodes and acts as the conduit between regions.

For it's part, your Request Handler talks to a ShardRegion, which routes the message if necessary in conjunction with the coordinator. Presumably, there is a JVM-local ShardRegion for each Request Handler to talk to, but there's no reason that it could not be a remote actor.

When there is a change in the number of nodes, ShardCoordinator needs to move shards (i.e. the collections of entities that were managed by that ShardRegion) that are going to shut down in a process called "rebalancing". During that period, the entities within those shards are unavailable, but the messages to those entities will be buffered until they are available again. To this end, "being available" means that the new ShardRegion responds to a directed message for that entity.

It's up to you to bring that entity back to life on the new node. Akka Persistence makes this very easy, but requires you to use the Event Sourcing pattern in the process. This isn't a bad thing, as it can lead to web-scale performance much more easily. This is especially true when the database in use is something like Apache Cassandra. You will see that nodes are "passivated", which is essentially just caching off to disk so they can be restored on request, and Akka Persistence works with that passivation to transparently restore the nodes under the control of the new ShardRegion – essentially a "move".

How to manage Akka Actor's paths in distributed system?

Answers (1)

Related Questions

How to manage Akka Actor&#39;s paths in distributed system?

Answers (1)

Related Questions

How to manage Akka Actor's paths in distributed system?