Hazelcast ITopic and listener crash

Question

I have a multi-node Hazelcast cluster application that uses ITopics. I'm trying to understand whether, for things to be "cleaned up" properly when a node crashes, my application should detect the node crash and remove that node's registration IDs, or whether Hazelcast takes care of that automatically.

By "node crash" I mean that an app that is part of a Hazelcast cluster terminates ungracefully, without calling ITopic.removeMessageListener or HazelcastInstance.shutdown. This could be due to the app crashing or being killed or the host crashing.

Here's the long story, in case it helps. I don't know the internals of Hazelcast and couldn't find anything relevant in the documentation. However, I can think of a couple of ways this "automatic" cleanup could work:

1. On each node, Hazelcast keeps a list of all subscribers, both local and remote. When it detects that another node is unavailable, Hazelcast automatically removes that node's listeners from the list of ITopic subscribers.

2. On each node, Hazelcast only keeps a list of local subscribers. When a publisher calls ITopic.publish, Hazelcast sends the message to all nodes. Upon receiving the message, Hazelcast on each node calls onMessage on all local subscribers.

Here's a sample scenario. Let's suppose I have a Hazelcast cluster with 2 nodes, A and B. Both node A and node B register listeners to the same ITopic via ITopic.addMessageListener.

Let's suppose that node B crashes without calling ITopic.removeMessageListener or HazelcastInstance.shutdown.

Eventually, Hazelcast on node A detects that node B is unavailable.

Now let's suppose that a publisher on node A calls ITopic.publish. Does Hazelcast on A still try to send the message to the subscriber on B? And let's suppose that after some time node B is restarted, and a publisher on A calls ITopic.publish. Does Hazelcast on A still try to send the message to the old subscriber on B?

Thank you in advance.

noctarius · Accepted Answer

Hazelcast automatically removes the listeners of a dead node once it detects that the node is gone. If this doesn't happen (I guess there might be a reason you're asking), it's a bug.

Hazelcast will also not try to send events to a node after it has been recognized as dead. That also means events published in the absence of node B won't be redelivered when the node comes back. There is no correlation between the old, dead node B and the newly connected one.
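
To make the observable behavior concrete, here is a toy model in plain Java. It is only a sketch of the semantics described above and is not Hazelcast code or a description of its internals; the class ToyTopic and its methods addListener, memberRemoved and publish are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy model only: NOT Hazelcast code. All names here are hypothetical.
class ToyTopic {
    // Member id -> listeners registered by that member.
    private final Map<String, List<Consumer<String>>> listenersByMember = new LinkedHashMap<>();

    // Stand-in for a member registering a listener on the topic.
    void addListener(String memberId, Consumer<String> listener) {
        listenersByMember.computeIfAbsent(memberId, id -> new ArrayList<>()).add(listener);
    }

    // Stand-in for death detection: every registration of that member is dropped.
    void memberRemoved(String memberId) {
        listenersByMember.remove(memberId);
    }

    // Delivers only to listeners registered right now; nothing is queued
    // for members that are absent at publish time.
    void publish(String message) {
        for (List<Consumer<String>> listeners : listenersByMember.values()) {
            for (Consumer<String> listener : listeners) {
                listener.accept(message);
            }
        }
    }
}
```

In this model, if A and B both register, B is removed, and B later registers again as a fresh member, then a message published while B was gone reaches only A's listener. The new B listener sees only messages published after its own registration, and the old B listener never receives anything again.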

Does that answer the question? :)
