alexanoid

Reputation: 25770

Spring Batch in clustered environment, high-availability

Right now I use an H2 in-memory database as the JobRepository for my single-node Spring Batch/Boot application.

Now I would like to run the Spring Batch application on two nodes in order to increase performance (by distributing jobs between the two instances) and to make the application more fault tolerant.

Instead of H2, I'm going to use PostgreSQL and configure both application instances to use this shared database. Is that enough for Spring Batch to work properly in a cluster and start distributing jobs between the cluster nodes, or do I need to take additional steps?

Upvotes: 1

Views: 3625

Answers (1)

Mahmoud Ben Hassine

Reputation: 31590

Depending on how you distribute your jobs across the nodes, you might need to set up a communication middleware (such as a JMS or AMQP provider) in addition to a shared job repository.

For example, if you use remote partitioning, your job will be partitioned and each worker can run on a different node. In this case, the job repository must be shared in order for:

  • the workers to report their progress to the job repository
  • the master to poll the job repository for the workers' statuses.
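
To make the role of the shared repository concrete, here is a minimal plain-Java simulation of that report-and-poll interaction. It is only a sketch and does not use the Spring Batch API: `SharedRepositoryDemo`, `SharedRepository` and `runPartitions` are hypothetical names, the in-memory map stands in for the shared PostgreSQL job repository, and the thread pool stands in for workers on separate nodes (the messaging middleware that would deliver the partitions is not modeled).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedRepositoryDemo {

    // Hypothetical stand-in for the shared job repository
    // (a PostgreSQL database in a real cluster).
    static final class SharedRepository {
        private final Map<String, String> statuses = new ConcurrentHashMap<>();

        // Called by workers to record their progress.
        void report(String partition, String status) {
            statuses.put(partition, status);
        }

        // Called by the master to check whether every partition has finished.
        boolean allCompleted(int expectedPartitions) {
            return statuses.size() == expectedPartitions
                    && statuses.values().stream().allMatch("COMPLETED"::equals);
        }

        Map<String, String> snapshot() {
            return Map.copyOf(statuses);
        }
    }

    // Master side: hand one partition to each worker, then poll the
    // repository until all workers have reported COMPLETED.
    // partitionCount must be at least 1.
    static Map<String, String> runPartitions(int partitionCount) throws InterruptedException {
        SharedRepository repository = new SharedRepository();
        // Threads simulate workers that would run on different nodes.
        ExecutorService workers = Executors.newFixedThreadPool(partitionCount);
        try {
            for (int i = 0; i < partitionCount; i++) {
                String partition = "partition" + i;
                workers.submit(() -> {
                    repository.report(partition, "STARTED");
                    // A real worker would read, process and write its items here.
                    repository.report(partition, "COMPLETED");
                });
            }
            while (!repository.allCompleted(partitionCount)) {
                TimeUnit.MILLISECONDS.sleep(10); // polling interval
            }
            return repository.snapshot();
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runPartitions(2));
    }
}
```

If each node kept its own in-memory repository instead, the master's poll would never see the workers' updates, which is why the repository has to be shared in this setup.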

If your jobs are completely independent and you don't need features like restart, you can continue using an in-memory database for each job and launch multiple instances of the same job on different nodes. But even in this case, I would recommend using a production-grade job repository instead of an in-memory database. Things can go wrong very quickly in a clustered environment, and having a job repository to store the execution status, synchronize executions, restart failed executions, etc., is crucial in such an environment.

Upvotes: 2
