Reputation: 4772
What does the INFO message of
FailureDetector(akka://MyCluster) - Remove heartbeat connection [akka://[email protected]:35250]
in an Akka cluster mean? I can't find anything about it in the documentation. I see it fairly often when running lots of JVMs with actors on a test machine, but I'm not sure whether it's a bad sign that calls for some kind of Akka or Linux tuning.
Akka 2.1.4 on Oracle JDK 1.7
Update: Having followed @cmbaxter's advice, I investigated options for tuning heartbeats. I found that increasing or decreasing the timings associated with heartbeats had no effect on the presence of the 'Remove heartbeat connection' messages. However, I noticed the 'monitored-by-nr-of-members' configuration setting. I now believe the messages indicate that monitoring of heartbeats from a particular node is being passed from one ActorSystem to another. In other words, the current system is simply reporting that monitoring that node is no longer its own responsibility, rather than warning about any kind of connectivity problem. Indeed, during system start-up the first node receives a great many 'First heartbeat' messages but then removes most of those connections, as per the 'monitored-by-nr-of-members' setting, as the load is passed to other nodes.
Upvotes: 0
Views: 874
Reputation: 35463
The message you are seeing is coming from the AccrualFailureDetector
class in Akka. According to the docs:
The nodes in the cluster monitor each other by sending heartbeats to detect if a
node is unreachable from the rest of the cluster. The heartbeat arrival times is
interpreted by an implementation of The Phi Accrual Failure Detector.
My guess is that a cluster node (running locally, on port 35250) has become unreachable enough times that it has been deemed to no longer be part of the cluster. When that happens, the heartbeat check to that node is removed, and so you see this message. If you believe the node was reachable and should not have been removed from the cluster heartbeat, then you might have an issue. Take a look at the Failure Detector section of the Cluster docs for more information on how to tune failure detection.
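As a rough sketch of what that tuning might look like, the fragment below sets a few failure detector options in application.conf. The values are illustrative assumptions, not recommendations or verified defaults, and the exact key names can differ between Akka versions, so check them against the reference.conf shipped with the akka-cluster jar you are running.

```
akka.cluster.failure-detector {
  # Higher threshold = fewer false positives, but slower detection of real failures.
  # (illustrative value, assumed)
  threshold = 12.0

  # How long a gap between heartbeats is tolerated before suspicion rises quickly,
  # e.g. to ride out GC pauses on a heavily loaded test machine. (illustrative value)
  acceptable-heartbeat-pause = 10 s

  # How often heartbeats are sent. (illustrative value)
  heartbeat-interval = 1 s

  # How many other members monitor each node. (illustrative value)
  monitored-by-nr-of-members = 5
}
```

If the message still appears at the same rate after changing the timing values, that suggests it is not being triggered by missed heartbeats.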
Upvotes: 1