user3034824

Reputation: 47

Handle the crashed remote actor in a cluster

I am new to Akka. I built an Akka cluster in which one node acts as the master and distributes work to the slave nodes. The master node is started first, and the slave nodes then register themselves with the master. If a slave leaves gracefully, the master receives a message for which the following holds:

message instanceof Terminated

The master then performs some recovery for that slave node. But if the slave simply crashes, how can I handle it? Currently the console just prints a "Connection refused" error. Could anyone tell me how to catch this error and find the ActorRef of the crashed slave, so that the master can perform the same recovery for the crashed slave node?

Thank you very much

Upvotes: 1

Views: 1243

Answers (2)

Sergiy Prydatchenko

Reputation: 988

You can maintain a list (or map) of the other nodes' addresses together with the corresponding ActorRefs (or actor paths) on them. You can then subscribe to cluster events (such as UnreachableMember) and do the recovery when you receive one.

Something like this:

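import akka.actor.{Actor, ActorRef, Address}
import akka.cluster.Cluster
import akka.cluster.ClusterEvent.{MemberEvent, MemberExited, MemberRemoved, UnreachableMember}

class ClusterRefRecoverExample extends Actor {

  // Maps each node's address to the ActorRef that lives on that node.
  private val membersWithActorRefs = collection.mutable.HashMap[Address, ActorRef]()

  // Subscribe to membership changes and to unreachability notifications.
  override def preStart(): Unit = {
    super.preStart()
    val cluster = Cluster(context.system)
    cluster.subscribe(self, classOf[MemberEvent])
    cluster.subscribe(self, classOf[UnreachableMember])
  }

  override def postStop(): Unit = {
    super.postStop()
    Cluster(context.system).unsubscribe(self)
  }

  // Looks up the ActorRef registered for the given node, if there is one.
  def recoverAddress(addr: Address): Unit = {
    membersWithActorRefs.get(addr) foreach { theRef =>
      // do your recovery here
    }
  }

  def removeAddress(addr: Address): Unit = {
    membersWithActorRefs.remove(addr)
  }

  def receive = {

    // .... (other cases omitted)

    // The failure detector considers the node unreachable (e.g. it crashed).
    case UnreachableMember(member) =>
      recoverAddress(member.address)

    case MemberRemoved(member, _) =>
      removeAddress(member.address)

    case MemberExited(member) =>
      removeAddress(member.address)
  }

}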

Upvotes: 1

Viktor Klang

Reputation: 26579

From the Cluster documentation:

"Death watch uses the cluster failure detector for nodes in the cluster, i.e. it generates Terminated message from network failures and JVM crashes, in addition to graceful termination of watched actor." - http://doc.akka.io/docs/akka/2.2.3/scala/cluster-usage.html
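In practice this means the master only has to watch each slave when it registers, and the existing Terminated handling then covers crashes as well. Here is a minimal sketch of that idea, assuming Akka 2.3 or later (for sender()). The names Master, RegisterSlave and recover are made up for illustration, since the question does not show the real registration message or recovery code:

```scala
import akka.actor.{Actor, ActorRef, Terminated}

// Hypothetical registration message sent by a slave to the master.
case object RegisterSlave

class Master extends Actor {

  // Slaves that are currently registered and watched.
  private var slaves = Set.empty[ActorRef]

  def receive = {
    case RegisterSlave =>
      val slave = sender()
      // Death watch: the master gets Terminated(slave) when the slave stops,
      // whether it stopped gracefully or its node was lost.
      context.watch(slave)
      slaves += slave

    case Terminated(slave) =>
      // The terminated ActorRef is carried in the message itself.
      slaves -= slave
      recover(slave)
  }

  // Placeholder for the recovery logic (hypothetical hook, not from the question).
  protected def recover(slave: ActorRef): Unit = ()
}
```

The ActorRef of the lost slave arrives inside the Terminated message, so no separate bookkeeping by address is needed. Note that for a crashed node the message is not immediate: it is delivered only after the failure detector has marked the node unreachable and the node has been downed and removed from the cluster.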

Upvotes: 1
