Reputation: 2459
In our project, we have a stateful server. The server runs a rule engine (Drools) and exposes its functionality through a REST service. It is a monitoring system, and it is critical that its uptime is more or less 100%. We therefore need a strategy for shutting down a server for maintenance, and a strategy for continuing to monitor an agent when one server is offline.
The first step might be to put a message queue or service bus in front of the Drools servers to hold messages that have not yet been processed, together with a mechanism for backing up the server's state to a database or other storage. This makes it possible to shut down a server for a few minutes to deploy a new version. But the question is what to do when a server goes offline unexpectedly. Are there any failover strategies for stateful servers, and what is your experience? Any advice is welcome.
Upvotes: 0
Views: 756
Reputation: 9480
There's no 'correct' way that I can think of. It rather depends on things like:
Some ideas for enabling fail-over:
An additional point to consider when replaying events is that you probably don't want any alerts to be raised to the outside world until the replay has completed. For instance, you probably don't want 50 alert emails sent to say that ApplicationX is down, up, down, up, down, up, ...
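As a rough illustration of suppressing alerts during a replay, here is a minimal Python sketch. The `Monitor` class, its `send_alert` callback and the `(application, status)` event shape are all hypothetical names chosen for the example; they are not part of Drools. Sending a single alert for the final state once the replay finishes is also an assumption, not something a rule engine does for you.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Tuple

Event = Tuple[str, str]  # hypothetical shape: (application, status)


@dataclass
class Monitor:
    send_alert: Callable[[str], None]           # hypothetical outbound alert hook
    state: Dict[str, str] = field(default_factory=dict)
    replaying: bool = False

    def handle(self, event: Event) -> None:
        """Update state; alert on a status change unless a replay is running."""
        app, status = event
        previous = self.state.get(app)
        self.state[app] = status
        if status != previous and not self.replaying:
            self.send_alert(f"{app} is {status}")

    def replay(self, events: Iterable[Event]) -> None:
        """Rebuild state from stored events without alerting on each flip."""
        before = dict(self.state)
        self.replaying = True
        try:
            for event in events:
                self.handle(event)
        finally:
            self.replaying = False
        # Assumption: once the replay is done, alert once per application
        # whose final status differs from the status before the replay.
        for app, status in self.state.items():
            if before.get(app) != status:
                self.send_alert(f"{app} is {status}")
```

With this sketch, replaying a down/up/down sequence for ApplicationX produces one alert for the final state instead of one per transition.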
I'll assume that a monitoring application might be pushing alerts to the outside world in some form. If you have a hot-hot configuration as in 4, you also need to control your alerts. I would be tempted to deal with this by configuring each monitor to push alerts to its own queue. Middleware could then forward alerts from the secondary monitor to a dead letter queue. Failover would mean reconfiguring the middleware so that primary alerts go to the dead letter queue and secondary alerts go to the alert channel. The same mechanism could also be used to discard alerts raised during a replay recovery.
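The routing idea can be sketched roughly as follows, using in-memory queues in place of real middleware. `AlertRouter`, `route` and `fail_over` are hypothetical names for illustration only; in practice this logic would live in the routing configuration of whatever message broker you use.

```python
from collections import deque


class AlertRouter:
    """Forwards alerts from the active monitor; dead-letters everything else."""

    def __init__(self, active: str = "primary") -> None:
        self.active = active          # name of the monitor whose alerts are real
        self.alert_channel = deque()  # stands in for the outbound alert channel
        self.dead_letter = deque()    # stands in for the dead letter queue

    def route(self, source: str, alert: str) -> None:
        # Only the active monitor reaches the outside world.
        if source == self.active:
            self.alert_channel.append((source, alert))
        else:
            self.dead_letter.append((source, alert))

    def fail_over(self, new_active: str) -> None:
        # Failover is a routing change only; the monitors are not touched.
        self.active = new_active


router = AlertRouter()
router.route("primary", "ApplicationX is down")    # delivered
router.route("secondary", "ApplicationX is down")  # dead-lettered
router.fail_over("secondary")
router.route("secondary", "ApplicationX is up")    # now delivered
```

The point of the design is that both monitors keep running and raising alerts as normal, and the decision about which one is heard is made in a single place.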
Given the complexity and potential mess that can arise from replaying events, for a monitoring application I would probably prefer starting from a clean slate or going with persisted sessions. However, this may well depend on what you are monitoring.
Upvotes: 1