Reputation: 327
We have main orchestration that has multiple sub orchestration. All root orchestration is of transaction type:none, hence all the sub are also of same nature. Now any exception is caught in a parent scope of main orchestration and we have some steps like logging. The orchestration is activated with a message from App SQL. So every time an exception occurs, say due to something intermittent, like unable to connect to web service. We later go manually re-trigger.
I'm looking at modifying the orch to be self healing, say from exception catch block it reinitialize the messages based on conditions that tell, the issue was intermittent. Something like app issue-null reference, we would not want to resend message, because, the orch is never going to work.
There is a concept called compensation, but that is for transaction based orch- do n steps if any 1 fails, do m other steps(which would do alternate action or cleanup).
The only idea I have is do a look-up based on keywords in exception and decide to resend messages. But I want some1 to challenge this or suggest a better approach
Upvotes: 2
Views: 138
Reputation: 31780
I have always thought that it's better to handle failures offline. So if the orchestration fails, terminate it. But before you terminate, send a message out. This message will contain all the information necessary to recover the message processing if it turns out that there was a temporary problem which caused the failure. The message can be consumed by a "caretaker" process which is responsible for recovery.
This is similar to how the Erlang OTP framework approaches high availability. Processes fail quickly and caretaker processes make sure recovery happens.
Upvotes: 1