riorio

Reputation: 6816

Spring Batch JobRepository location and scaling

From this article we can learn that Spring Batch stores a job's status in a SQL repository.

And from this article we can learn that the location of the JobRepository is configurable: it can be in-memory or a remote database.

So if we need to scale a batch job, should we run several different Spring Batch JARs, all configured to use the same shared DB in order to keep them synchronized?

Is this the right pattern / architecture?

Upvotes: 0

Views: 285

Answers (1)

Mahmoud Ben Hassine

Reputation: 31590

Yes, this is the way to go. The problem that can occur when you launch the same job from different physical nodes is that the same job instance may be created twice. In that case, Spring Batch would not know which instance to pick up when restarting a failed execution. A shared job repository acts as a safeguard against this kind of concurrency issue.
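To illustrate the race described above, here is a minimal JDK-only sketch. It is not Spring Batch code: `SharedRepository`, `createJobInstance`, and `launchFromNodes` are hypothetical names, and a `ConcurrentHashMap` stands in for the shared database, with `putIfAbsent` playing the role of the transactional check-and-insert. Several threads act as nodes launching the same job at once, and only one of them gets to create the instance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedRepositorySketch {

    // Hypothetical stand-in for a shared job repository: one record per job instance key.
    static final class SharedRepository {
        private final Map<String, String> instances = new ConcurrentHashMap<>();

        // Atomic check-and-insert; returns true only for the node that created the instance.
        boolean createJobInstance(String jobKey, String nodeName) {
            return instances.putIfAbsent(jobKey, nodeName) == null;
        }

        int instanceCount() {
            return instances.size();
        }
    }

    // Launches the same job key from `nodes` threads at once and
    // returns how many of them succeeded in creating the instance.
    static int launchFromNodes(SharedRepository repository, String jobKey, int nodes)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger created = new AtomicInteger();
        for (int i = 0; i < nodes; i++) {
            String nodeName = "node-" + i;
            pool.submit(() -> {
                try {
                    start.await(); // hold all nodes until released together
                    if (repository.createJobInstance(jobKey, nodeName)) {
                        created.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return created.get();
    }

    public static void main(String[] args) throws InterruptedException {
        SharedRepository repository = new SharedRepository();
        int created = launchFromNodes(repository, "importJob|date=2024-01-01", 4);
        System.out.println("instances created: " + created); // prints "instances created: 1"
    }
}
```

If each node kept its own private repository instead, every node would see an empty map and create its own copy of the instance, which is the duplicate-instance situation described above.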

The job repository achieves this synchronization through the transactional capabilities of the underlying database. The IsolationLevelForCreate property can be set to an aggressive value (SERIALIZABLE is the default) to avoid the issue described above.
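As a configuration sketch of the above, the isolation level can be set explicitly on `JobRepositoryFactoryBean`. The class name `SharedJobRepositoryConfig` is hypothetical, and the injected `DataSource` and `PlatformTransactionManager` are assumed to be defined elsewhere and to point at the same shared database on every node. Details may differ between Spring Batch versions, so treat this as a starting point rather than a definitive setup.

```java
import javax.sql.DataSource;

import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.repository.support.JobRepositoryFactoryBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

// Hypothetical configuration class; every node would load the same settings.
@Configuration
public class SharedJobRepositoryConfig {

    @Bean
    public JobRepository jobRepository(DataSource dataSource,
                                       PlatformTransactionManager transactionManager) throws Exception {
        JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
        // Assumed to be the shared database that all nodes connect to.
        factory.setDataSource(dataSource);
        factory.setTransactionManager(transactionManager);
        // Already the default; set here only to make the choice visible.
        factory.setIsolationLevelForCreate("ISOLATION_SERIALIZABLE");
        factory.afterPropertiesSet();
        return factory.getObject();
    }
}
```

Since SERIALIZABLE is already the default, the explicit call mainly documents intent; lowering it would trade the duplicate-instance safeguard for less database contention.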

Upvotes: 1
