Peter C. Glade

Reputation: 629

Quarkus resilience best practice

I have a use case like the following:

One Quarkus microservice is responsible for talking to several other fixed APIs (e.g. the ArgoCD REST API and a standard corporate API) to bring the whole system into the desired state.

The whole request needs to be transactional, which means that either all API requests succeed or everything is rolled back if any of them fails.

If the APIs return errors, the case is clear to me: "just revert everything you have done so far". But what happens if my Quarkus application itself crashes?

An example:

Quarkus application receives a POST request on an endpoint which starts the following tasks:

  1. read a resource on REST-API 1
  2. create a resource on REST-API 2 (ArgoCD API)
  3. create a resource on REST-API 3 (Standard Corporate API)

If my Quarkus application dies after step two, it will leave the overall system in an inconsistent state.

For example, my second request creates an ArgoCD application using its REST API. If the third request then fails, I have to delete the created application again to bring the system back to a consistent state.

The LRA approach is not applicable here because the ArgoCD REST API does not implement the LRA API.

So, at the very least, I have to maintain the state and the compensation logic within my Quarkus application. In addition, I need to persist the state of my transaction somewhere so that I can recover from it after a failure.

My current solution uses a Redis database alongside the application to persist the state of each transaction until it is finished, but I was wondering whether I missed some standard solution matching my use case.
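To make the idea concrete, here is a minimal sketch of the kind of journal described above. It is only an illustration: `SagaJournal`, its method names, and the step names are hypothetical, and an in-memory map stands in for Redis. The point it shows is that the intent is recorded before each remote call, so that after a restart the application can tell which steps may have left something behind.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical write-ahead journal for one multi-step request (not a Quarkus API). */
public class SagaJournal {

    public enum Status { STARTED, DONE, COMPENSATED }

    // Assumption: stand-in for the durable store (Redis in my case).
    // Layout: transaction id -> step name -> status, in insertion order.
    private final Map<String, Map<String, Status>> store = new LinkedHashMap<>();

    /** Record the intent BEFORE calling the remote API, so a crash mid-call is visible. */
    public void begin(String txId, String step) {
        store.computeIfAbsent(txId, id -> new LinkedHashMap<>()).put(step, Status.STARTED);
    }

    /** Record that the remote call returned successfully. */
    public void complete(String txId, String step) {
        store.get(txId).put(step, Status.DONE);
    }

    /** Record that the step has been undone. */
    public void markCompensated(String txId, String step) {
        store.get(txId).put(step, Status.COMPENSATED);
    }

    /** The whole request succeeded, so there is nothing left to recover. */
    public void finish(String txId) {
        store.remove(txId);
    }

    /**
     * Steps that still need to be undone, newest first.
     * STARTED steps are included because the remote resource may or may not exist.
     */
    public List<String> pendingCompensations(String txId) {
        List<String> steps = new ArrayList<>();
        Map<String, Status> tx = store.getOrDefault(txId, Map.of());
        for (Map.Entry<String, Status> entry : tx.entrySet()) {
            if (entry.getValue() != Status.COMPENSATED) {
                steps.add(0, entry.getKey()); // prepend to reverse the order
            }
        }
        return steps;
    }
}
```

Because a step in state `STARTED` may or may not have reached the remote API, the compensation for it would have to be idempotent (for example "delete if it exists").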

Upvotes: 0

Views: 679

Answers (1)

zaerymoghaddam

Reputation: 3127

I think there is no easy way to achieve what you're looking for. Of course, it depends on the level of resiliency or isolation that is acceptable for your business. There is actually a dedicated MicroProfile specification named LRA (Long Running Actions) that exists to solve exactly this problem. It's an implementation of the saga pattern.

The problem is that it's not enough to rely only on calling the rollback on those APIs. What if those rollbacks fail themselves, or the caller crashes halfway through? There are many such cases, so there is always a need for a coordinator that orchestrates the whole process. You can read more about it in the links above or in the Quarkus extension for LRA.

I know it may not necessarily solve your problem if, for example, you don't have access to those APIs' source to add LRA annotations, but I hope it gives you some ideas about the way people typically solve this.
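To illustrate what such a coordinator has to deal with, here is a rough sketch in plain Java. It is not the LRA API; `CompensatingCoordinator`, `Step`, and `maxAttempts` are names made up for this example. It runs forward actions, undoes completed ones newest first, retries a failing compensation a limited number of times, and reports the steps it could not undo.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical in-process coordinator; a sketch, not the MicroProfile LRA API. */
public class CompensatingCoordinator {

    /** A completed step together with the action that undoes it. */
    public static final class Step {
        final String name;
        final Runnable compensation;

        Step(String name, Runnable compensation) {
            this.name = name;
            this.compensation = compensation;
        }
    }

    private final Deque<Step> completed = new ArrayDeque<>();
    private final int maxAttempts;

    public CompensatingCoordinator(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** Run a forward action and remember how to undo it only once it has succeeded. */
    public void run(String name, Runnable action, Runnable compensation) {
        action.run();
        completed.push(new Step(name, compensation));
    }

    /** Undo completed steps newest first; returns the names of steps that could not be undone. */
    public List<String> rollback() {
        List<String> stuck = new ArrayList<>();
        while (!completed.isEmpty()) {
            Step step = completed.pop();
            if (!tryCompensate(step)) {
                stuck.add(step.name);
            }
        }
        return stuck;
    }

    private boolean tryCompensate(Step step) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                step.compensation.run();
                return true;
            } catch (RuntimeException e) {
                // A real coordinator would log and back off before the next attempt.
            }
        }
        return false;
    }
}
```

A typical use would be to call `rollback()` from the `catch` block around the sequence of `run(...)` calls. Note that this sketch keeps its state in memory only, so it does not survive a crash of the process; the completed steps and the stuck ones would have to be persisted somewhere durable for that.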

Upvotes: 1
