user93796

Reputation: 18379

Spring Batch: how it works in a distributed environment

I am planning to use Spring Batch in a distributed environment to do some batch processing tasks.


By distributed environment I mean that I have a set of boxes with a front-ending web service. A load balancer then distributes the jobs to the boxes.


Now I have a few questions:
1) What happens if a job is terminated halfway (say the box got restarted)? Will Spring Batch automatically restart the job, or do I need to write my own custom watcher and then call the Spring Batch API to restart the job?
2) If Spring Batch has this kind of automatic restart, can two boxes pick up and execute the same job at once?

Upvotes: 2

Views: 2301

Answers (1)

Angular University

Reputation: 43087

Spring Batch has four strategies to handle scalability, see here for further details:

  • Multi-threaded Step (single process)
  • Parallel Steps (single process)
  • Remote Chunking of Step (multi process)
  • Partitioning a Step (single or multi process)

Yours is a multi-process scenario, so you can choose between remote chunking of a step and partitioning a step, depending on the cost of the read part compared to the process/write part.

In both cases, two instances cannot do duplicate work; the design exists to prevent that. Duplication could only happen if you accidentally deployed one of the two single-process mechanisms on different machines, which would cause the problem you mention.
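
To illustrate why partitioning does not lead to duplicate work, here is a minimal sketch in plain Java. It is a conceptual model only and does not use the Spring Batch API: the class `PartitionSketch`, the type `Range` and the method `split` are hypothetical names invented for this example. It assumes the input can be addressed by a numeric id range, which is then divided into contiguous, non-overlapping slices, one per worker.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual model only: plain Java, NOT the Spring Batch API.
final class PartitionSketch {

    /** Inclusive id range assigned to one worker (hypothetical type). */
    static final class Range {
        final long minId;
        final long maxId;

        Range(long minId, long maxId) {
            this.minId = minId;
            this.maxId = maxId;
        }

        @Override
        public String toString() {
            return "[" + minId + ", " + maxId + "]";
        }
    }

    /** Splits [minId, maxId] into at most gridSize contiguous, non-overlapping ranges. */
    static List<Range> split(long minId, long maxId, int gridSize) {
        if (gridSize < 1 || maxId < minId) {
            throw new IllegalArgumentException("need gridSize >= 1 and maxId >= minId");
        }
        long total = maxId - minId + 1;
        long size = (total + gridSize - 1) / gridSize; // ceiling division
        List<Range> ranges = new ArrayList<>();
        for (long start = minId; start <= maxId; start += size) {
            // Each range begins right after the previous one ends, so no id is shared.
            ranges.add(new Range(start, Math.min(start + size - 1, maxId)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Ten ids shared between three workers.
        for (Range range : split(1, 10, 3)) {
            System.out.println(range);
        }
    }
}
```

Because every id falls into exactly one range, a worker that only reads its own range cannot overlap with another worker, whichever box it runs on.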

Restart logic is also provided for; see the Restartability section for further details.

Upon restart, the job will go on reading, processing and writing the next chunk of data. If the reader, processor and writer are configured or written taking into account that the task is chunked, it will all work out of the box.

Usually this involves having the write part mark the items read in that chunk as 'processed'.
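
The 'processed' marker idea can be sketched in plain Java as follows. This is a simulation under stated assumptions and not Spring Batch code: `RestartableChunkSketch`, `Item` and `run` are hypothetical names, the in-memory list stands in for a database table, and the `maxChunks` parameter exists only to simulate the box going down partway through. In a real job the write and the flag update would normally belong to the same chunk transaction.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual model only: plain Java, NOT the Spring Batch API.
final class RestartableChunkSketch {

    /** Hypothetical row with a 'processed' flag, standing in for a database record. */
    static final class Item {
        final int id;
        boolean processed;
        int timesWritten;

        Item(int id) {
            this.id = id;
        }
    }

    /**
     * Processes unprocessed items chunk by chunk and stops after maxChunks
     * chunks, which simulates an interruption. Returns the chunks completed.
     */
    static int run(List<Item> store, int chunkSize, int maxChunks) {
        int chunks = 0;
        while (chunks < maxChunks) {
            // Read: pick up only items that are not yet marked as processed.
            List<Item> chunk = new ArrayList<>();
            for (Item item : store) {
                if (!item.processed) {
                    chunk.add(item);
                    if (chunk.size() == chunkSize) {
                        break;
                    }
                }
            }
            if (chunk.isEmpty()) {
                break; // nothing left to do
            }
            // Write: do the work and mark the chunk's items as processed.
            for (Item item : chunk) {
                item.timesWritten++;
                item.processed = true;
            }
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Item> store = new ArrayList<>();
        for (int id = 1; id <= 10; id++) {
            store.add(new Item(id));
        }
        int first = run(store, 3, 2);                  // interrupted after two chunks
        int second = run(store, 3, Integer.MAX_VALUE); // restart picks up the rest
        System.out.println("first run chunks: " + first);
        System.out.println("second run chunks: " + second);
    }
}
```

Since the reader skips anything already marked, the restarted run continues with the remaining items instead of repeating the chunks that completed before the interruption.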

Upvotes: 1
