Reputation: 2611
I am using the Camunda community edition for a workflow project that orchestrates a microservice flow, similar to this. All the features in the community edition are enough for my requirements except high availability and auto recovery.
For high availability, would it be enough to make the database (MySQL) highly available as per this guide and run two or more instances of the Spring-based Camunda manager behind a load balancer?
How do I recover if Camunda accepts the BPMN request and that node fails or crashes after receiving the request?
In my case, each Spring-based Camunda manager receives the request and confirms it to the user with 202 (Accepted), then Camunda starts executing the workflow. So how can that job be auto-recovered and auto-resumed if the node that received the request crashes?
Upvotes: 4
Views: 1440
Reputation: 1030
Running multiple instances of the engine (= multiple Spring applications) on top of a highly available database (make sure it supports read-committed isolation, see https://docs.camunda.org/manual/7.12/introduction/supported-environments/#databases) is definitely sufficient to make Camunda highly available.
In case a node crashes after responding 202 you will fall back to "normal" Java/Spring transaction handling. https://docs.camunda.org/manual/7.12/user-guide/process-engine/transactions-in-processes/#transaction-boundaries should help to clarify this.
So if you make sure that you start your workflow instance, probably with an async start event, commit this transaction, and only then return 202, you are safe: the committed job is stored in the shared database, so another node's job executor can pick it up if the original node crashes. The only problem that can arise is a crash before returning 202, which typically leads to a retry on your REST API. For this case, make sure you start your workflow idempotently.
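To illustrate what "idempotently" means here, below is a minimal sketch in plain Java. It is not Camunda code: `IdempotentStarter`, `start` and `startedCount` are hypothetical names, and the in-memory map only stands in for the shared database. The assumption is that the client sends the same request id on every retry. With the real engine you could pass that id as the business key, e.g. via `RuntimeService.startProcessInstanceByKey(processDefinitionKey, businessKey)`, but you would still need some uniqueness guarantee in the database so that two nodes cannot both start an instance for the same id.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for "start a workflow instance"; not a Camunda API.
class IdempotentStarter {

    // Maps the client-supplied request id to the id of the started instance.
    // In a real cluster this lookup would live in the shared database.
    private final Map<String, String> instanceByRequestId = new ConcurrentHashMap<>();

    // Starts a new instance only if this request id has not been seen before.
    // computeIfAbsent is atomic, so concurrent retries cannot start two instances.
    String start(String requestId) {
        return instanceByRequestId.computeIfAbsent(
                requestId, id -> UUID.randomUUID().toString());
    }

    // Number of instances actually started.
    int startedCount() {
        return instanceByRequestId.size();
    }

    public static void main(String[] args) {
        IdempotentStarter starter = new IdempotentStarter();
        String first = starter.start("order-42");
        // Simulates the client retrying because it never received the 202.
        String retry = starter.start("order-42");
        System.out.println("same instance on retry: " + first.equals(retry));
        System.out.println("instances started: " + starter.startedCount());
    }
}
```

The point is only that a retry with the same request id returns the already started instance instead of creating a second one, so the client can safely repeat the call until it gets its 202.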
Upvotes: 4